Abstract

This paper systematically analyzes the observational learning paradigm of
Banerjee (1992) and Bikhchandani, Hirshleifer and Welch (1992). We first
relax the informational assumption that is the linchpin of the 'herding' results,
namely that individuals' private signals are uniformly bounded in their
strength. We then investigate the model with heterogeneous preferences, and
discover that a 'twin' observational pathology generically appears: Optimal
learning may well lead to a situation where no one can draw any inference
at all from history! We also point out that, counterintuitively, even with a
constant background 'noise' induced by trembling or crazy individuals, public
beliefs generically converge to the true state of the world.

All results are cast within a simple dynamic mathematical framework that
is (i) rich enough to describe a wide array of observational learning dynamics;
and (ii) amenable to economic modifications that hinder or abet informational
transmission, and sometimes permit full belief convergence to occur.

1. INTRODUCTION

Suppose that a countable number of individuals each must make a once-in-a-lifetime
binary decision,1 encumbered solely by uncertainty about the state of the world. Assume
that preferences are identical, and that there are no congestion effects or network
externalities from acting alike. Then in a world of complete and symmetric information,
all would ideally wish to make the same decision.

But  life  is  more  complicated  than  that.  Assume  instead  that  the  individuals  must  decide 
sequentially,  all  in  some  preordained  order.  Suppose  that  each  individual  may  condition 
his  decision  both  on  his  (endowed)  private  signal  about  the  state  of  the  world  and  on  all 
his predecessors' decisions, but not their private signals. The above simple framework was
independently  introduced  in  Banerjee  (1992)  and  Bikhchandani  et  al.  (1992)  (hereafter 
denoted  BHW).  Their  perhaps  surprising  common  conclusion  was  that  with  actions  and 
not  signals  publicly  observable,  there  was  a  positive  chance  that  a  'herd'  would  eventually 
arise:  Everyone  after  some  period  would  make  the  identical  less  profitable  decision. 

This is a compelling 'pathological' result: Individuals do not eventually learn the
true state of the world despite the more than sufficient wealth of information. So let's
seriously analyze the model whence it arises. For we believe that learning from others'
actions is economically important, and perhaps encompasses the greater part of individual
information acquisition that occurs in society at large. As such, from a theoretical
standpoint, it merits the scrutiny long afforded the single-person learning results, and the
rational expectations literature.

In this paper, we attempt a systematic analysis of the above observational learning
paradigm on two fronts: First, we develop a simple dynamic mathematical framework
rich enough to describe a wide array of observational learning dynamics. This offers key
insights into the probabilistic foundations of observational learning, and allows us to
generalize the economics at play relatively painlessly. Second, we refer to the results of
Banerjee  (1992)  and  BHW  as  the  standard  herding  story,  and  proceed  to  spin  alternative 
more  economic  stories  which  question  the  robustness  of  their  pointed  conclusion.  We  in 
fact  find  that  herding  is  not  the  only  possible  'pathological'  outcome.  For  not  only  is  it 
possible  that  all  individuals  may  almost  surely  end  up  taking  the  correct  action,  but  under 
just  as  plausible  conditions,  social  dynamics  may  well  converge  to  a  situation  where  no  one 
can  draw  any  inference  at  all  from  history!  We  relate  this  'confounded  learning'  outcome 
to  a  result  due  to  McLennan  (1984)  from  the  single-person  experimentation  literature.  We 
then  argue  that  the  twin  pathologies  of  herding  and  confounded  learning  are  essentially 
the  only  possible  ways  in  which  individuals  eventually  fail  to  learn  the  truth. 

On  Herding  and  'Cascades' 

While  'herding'  certainly  has  a  negative  connotation,  BHW  in  fact  pointed  out  that 
everyone  would  almost  surely  eventually  settle  on  a  common  decision.  For  this  reason, 
they  introduced  the  arguably  more  colorful  terminology  of  a  cascade,  referring  to  any 
such  infinite  train  of  individuals  who  decide  to  act  irrespective  of  the  content  of  their 


1 Economic examples include whether to invest in a newly opening and soon to be closing market. A
historical instance might have been the decision to enjoy the maiden voyage of the Titanic.


signal. While we find this terminology useful and shall adopt it, we still succumb to the
herd in denoting these works and those that followed them the 'herding' literature. The
common thread linking these papers is the herding 'pathology' that arises under sequential
decision-making when actions and not information signals of previous decision makers are
observable.2

This definition indirectly rules out the model of Lee (1993), who allows for continuous
action spaces which together with a continuous payoff function can perfectly reveal (at
least a range of) private signals!3 That no herding arises in Lee's context should come as
little surprise, at least to those that read this paper. Indeed, herding pathologies were
absent from the rational expectations literature for this very reason,4 since the Walrasian
price can adjust continuously.5 As a result, with one-dimensional information at least,
marginal changes in individuals' private signals all have impact on the publicly-observed
price. It is necessary in some sense that the entry of new information be endogenously
'lumpy' for herding to occur (so that it can eventually be choked off altogether).

A  Tour  of  the  Paper 

We  first  focus  on  the  key  role  played  by  the  informational  assumptions  underlying  the 
standard story. A crucial element is that the individuals cannot get arbitrarily strong private
signals, so that their private likelihood ratio is bounded away from zero and infinity.
For in that case, a finite history can provide such a strong signal that even the most
doctrinaire individual dare not quarrel with its conclusion. When this 'bounded beliefs'
assumption is relaxed, incorrect herds cannot arise, and in fact, eventually all individuals
will make the correct choice.6 That result is an application of the Borel-Cantelli Lemma,
since  if  individuals  were  herding  on  a  wrong  action,  then  there  would  with  probability  one 
appear  an  individual  with  so  strong  a  private  belief  that  he  would  take  another  action, 
thereby  revealing  his  very  strong  information  and  overturning  the  herd.  Indeed,  casual 
empiricism  suggests  that  individuals  who  are  arbitrarily  tenacious  in  their  beliefs  do  exist. 
But this assumption is largely a modelling decision, and therein lies its ultimate justification.
For it provides us with a richer model than possible in the standard story, allowing
us  to  consider  natural  economic  modifications  that  hinder  informational  transmission,  and 
ask  if  convergence  still  (almost  surely)  occurs.  This  natural  approach  to  robustness  was 
simply  not  possible  in  the  framework  of  Banerjee  (1992)  and  BHW. 

That  a  single  individual  can  'overturn  the  herd'  turns  out  to  be  the  key  insight  into  the 
nonexistence  of  herding  with  unbounded  beliefs.  So  our  first  key  economic  innovation  is 
to  prevent  this  from  happening,  and  introduce  noise  into  the  model.  Counterintuitively, 


2 It is noteworthy that by this definition, herding was discovered in some veiled form in a concluding
example in Jovanovic (1987).

3 Contrast, for a moment, Banerjee's (1992) model, which had a continuous action space but a
discontinuous payoff function.

4 To  be  perfectly  clear,  we  are  referring  to  the  literature  on  dynamic  price  formation  under  incomplete 
information.  For  a  good  take  on  this  field,  see  Bray  and  Kreps  (1987). 

5 This  assumes  that  individuals  can  continuously  adjust  their  trading  quantities.  If  not,  Avery  and 
Zemsky  (1992)  have  shown  that  a  temporary  herd  may  arise. 

6This  idea,  not  as  cleanly  expressed,  was  introduced  in  Smith  (1991),  which  this  paper  and  Smith  and 
Sorensen  (1994)  now  supersede. 


even with a constant inflow of crazy individuals, we still find that learning is complete
in the sense that everyone (sane) eventually learns the true state of the world. We then
turn to a parallel reason why the actions of isolated individuals need not matter, namely
multiple individuals' types. That is, we relax the assumption that all individuals have the
same preferences. This obviously was a crucial underlying tenet for the herding results
of  Banerjee  (1992)  and  BHW.  For  instance,  suppose  on  a  highway  under  construction, 
depending  on  how  the  detours  are  arranged,  that  those  going  to  Chicago  should  merge 
right  (resp.  left),  with  the  opposite  for  those  headed  toward  St.  Louis.  If  one  knows  that 
roughly  75%  are  headed  toward  Chicago,  then  absent  any  strong  signal  to  the  contrary, 
those  headed  toward  St.  Louis  should  take  the  lane  'less  traveled  by'.  That  the  resulting 
dynamics  might  converge  to  a  totally  uninformative  inference  even  with  arbitrarily  strong 
private  signals  is  the  surprising  content  of  Theorem  8. 

We  conclude  with  a  brief  discussion  of  costly  information  and  payoff  externalities.  We  do 
not  study  endogenous  timing,  as  we  have  little  to  add  to  the  recent  findings  of  Chamley  and 
Gale  (1992).  In  a  separate  more  involved  work  in  progress,  we  investigate  what  happens 
when  individuals  do  not  perfectly  observe  the  order  of  previous  individuals'  moves.  This 
is yet another (more typical) reason why contrary actions of isolated individuals might
have very little effect. Unfortunately, standard martingale results cannot be applied, and
therefore it falls outside the scope of this paper.

Identifying  the  appropriate  stochastic  processes  that  are  martingales  turns  out  to  have 
been a crucial step in our analysis. The essential analytics of the paper build on the fact
that the public likelihood ratio is (conditional on the state) a martingale and a homogeneous
Markov chain. The Markovian aspect of the dynamics allows us (just as it did Futia
(1982)) to drastically narrow the range of possible long run outcomes, as we need only
focus on the ergodic set. This set is wholly unrelated to initial conditions, and depends
only on the transition dynamics of the model. By contrast, the martingale property of
the model (which is unavailable in Futia (1982)) affords us a different glimpse into
the long run dynamics, tying them down to the initial conditions in expectation. As it
turns out, this allows us to eliminate from consideration the not implausible elements of
the ergodic set where everyone entertains entirely false beliefs in the long run.

Section  2  outlines  the  basic  mathematical  framework  within  which  we  are  operating. 
Section  3  takes  a  brief  mathematical  detour,  developing  some  key  generic  insights  on  the 
underlying probabilistic dynamics. We return to the characterization of when herding occurs
in section 4, and explore the robustness in sections 5, 6, and 7. An appendix, among other
things, derives some new results on the local stability of stochastic difference equations.
These results, whose absence from the literature (to our knowledge!) greatly surprised us,
ought to prove widely applicable across economics and the mathematical (social) sciences
in general.


2.  THE  STANDARD  MODEL 

2.1   Some  Notation 

We first introduce a background probability space $(\Omega, \mathcal{E}, \nu)$. This space underlies all
random processes in the model, and is assumed to be common knowledge.

An infinite sequence of individuals $n = 1, 2, \ldots$ sequentially takes actions in that exogenous
order. Individuals observe the actions of all predecessors. There are two states of
the world (or more simply, states), labelled H ('high') and L ('low'). Formally, this means
that the background state space $\Omega$ is partitioned into two events $\Omega^H$ and $\Omega^L$, called H
and L.7 The results derived below will go through with any finite number of states, but
the notation becomes considerably harder. Therefore we stick to the two-state case, and
later explain carefully how we can modify the analysis to more states. Let the common
prior belief be that $\nu(H) = \nu(L) = 1/2$. That individuals have common priors is a standard
modelling assumption; see e.g. Harsanyi (1967-68). Also, a flat prior over states is
truly WLOG, for it will turn out that more general priors will be formally equivalent to a
renormalization of the payoffs, as seen in section 2.2 below.

Individual n receives a private random signal, $\sigma_n \in \Sigma$, about the state of the world.
Conditional on the state, the signals are assumed to be i.i.d. across individuals. It is
common knowledge that in state H (resp. state L), the signal is distributed according
to the probability measure $\mu^H$ (resp. $\mu^L$). Formally, we mean that $\sigma_n : \Omega \to \Sigma$ is a
stochastic variable, and $\mu^H = \nu^H \circ \sigma_n^{-1}$ and $\mu^L = \nu^L \circ \sigma_n^{-1}$, where $\nu^H$ (resp. $\nu^L$) is the
measure $\nu$ conditioned on the event $\Omega^H$ (resp. $\Omega^L$). To ensure that no signal will perfectly
reveal the state of the world, we shall insist that $\mu^H$ and $\mu^L$ be mutually absolutely
continuous.8 Consequently, there exists a positive and finite Radon-Nikodym derivative
$g = d\mu^H / d\mu^L : \Sigma \to (0, \infty)$ of $\mu^H$ w.r.t. $\mu^L$. And to avoid trivialities, we shall rule out
$g = 1$ almost surely,9 so that $\mu^H$ and $\mu^L$ are not the same measure; this will ensure that
some signals are informative about the state of the world.

Using Bayes' rule, the individual arrives at what we shall refer to as his private belief
$p(\sigma) = g(\sigma)/(g(\sigma) + 1) \in (0, 1)$ that the state is H. Conditional on the state, private beliefs
are i.i.d. across individuals because signals are. In state H (resp. state L), p is distributed
with a c.d.f. $F^H$ (resp. c.d.f. $F^L$) on (0, 1). The distributions $F^H$ and $F^L$ are subtly linked.
In Appendix A, we prove among other things that $F^H$ and $F^L$ have the same support,10
and that $F^L - F^H$ increases (weakly) on [0, 1/2] and decreases (weakly) on [1/2, 1]. Denote
the common support of $F^H$ and $F^L$ by supp(F). The structure of supp(F) will play a
major role in the definition of herds. Observe that supp(F) coincides with the range of
$p(\cdot)$ on $\Sigma$.11 It is therefore


7 For later reference, refer to the restricted sigma fields as $\mathcal{E}^H$ and $\mathcal{E}^L$, respectively.

8 Recall that $\mu^H$ is absolutely continuous w.r.t. $\mu^L$ if $\mu^L(S) = 0 \Rightarrow \mu^H(S) = 0$ for all $S \in \mathcal{S}$, where $\mathcal{S}$ is
the $\sigma$-algebra on $\Sigma$. By the Radon-Nikodym Theorem, there exists then a unique $g \in L^1(\mu^L)$ such that
$\mu^H(S) = \int_S g \, d\mu^L$ for every $S \in \mathcal{S}$. See Rudin (1987).

9 Note that with $\mu^H$ and $\mu^L$ mutually a.c., 'almost sure' assertions are well-defined without specifying
which measure.

10 Recall that the support of a probability measure is any measurable set accorded probability 1. But
throughout the paper, 'the' support is well-defined modulo measure zero equivalence.

11 While the Radon-Nikodym derivative g is only determined with probability one, we can select a version


important to observe that the underlying results are ultimately driven by the probability
measures $\mu^H$ and $\mu^L$, which are the primitives of the model.

By construction, $\mathrm{co}(\mathrm{supp}(F)) = [\underline{b}, \bar{b}] \subseteq [0, 1]$ with $0 \le \underline{b} < \bar{b} \le 1$.12 We shall say that
the private beliefs are bounded if $0 < \underline{b} < \bar{b} < 1$. If $\mathrm{co}(\mathrm{supp}(F)) = [0, 1]$, we simply call the
private beliefs unbounded.
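The distinction between bounded and unbounded private beliefs can be illustrated with a minimal sketch. The two signal structures below are assumptions chosen only for illustration and do not appear in the paper: a binary signal that matches the state with probability 0.7, and a Gaussian signal with mean +1 in state H and -1 in state L. The function names are likewise hypothetical.

```python
import math

def private_belief(g):
    """Private belief p = g / (g + 1) from the likelihood ratio g = dmu^H/dmu^L (flat prior)."""
    return g / (g + 1.0)

def binary_lr(signal, precision=0.7):
    """Likelihood ratio of a binary signal that matches the state with prob. `precision` (assumed)."""
    return precision / (1 - precision) if signal == "h" else (1 - precision) / precision

def gaussian_lr(x, mean_h=1.0, mean_l=-1.0, sd=1.0):
    """Likelihood ratio of a signal distributed N(mean_h, sd^2) in H and N(mean_l, sd^2) in L (assumed)."""
    return math.exp(((x - mean_l) ** 2 - (x - mean_h) ** 2) / (2 * sd ** 2))

# Binary signals: the private belief takes only two values, so beliefs are bounded.
binary_beliefs = sorted(private_belief(binary_lr(s)) for s in ("h", "l"))

# Gaussian signals: extreme realizations push the private belief toward 0 or 1,
# so the convex hull of the support is [0, 1] and beliefs are unbounded.
gaussian_beliefs = [private_belief(gaussian_lr(x)) for x in (-10.0, 0.0, 10.0)]

print(binary_beliefs)
print(gaussian_beliefs)
```

In the binary case the convex hull of the support is roughly [0.3, 0.7], strictly inside (0, 1); in the Gaussian case no such bounds exist.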

Each individual can choose from a finite set of actions $(a_m)_{m \in \mathcal{M}}$, where $\mathcal{M} = \{1, \ldots, M\}$.
Action $a_m$ has a (common) payoff $u^H(a_m)$ in state H and $u^L(a_m)$ in state L.

The  objective  of  the  individual  is  to  take  the  action  that  maximizes  his  expected  payoff. 
We  assume  WLOG  that  no  action  is  weakly  dominated  (by  all  other  actions),  and  to  avoid 
trivialities  we  insist  that  at  least  two  such  undominated  actions  exist.  Before  deciding  upon 
an  action,  the  individual  can  observe  the  entire  action  history.  We  shall  loosely  denote  the 
action  profile  of  any  finite  number  of  individuals  as  h.  Exactly  how  the  individual  uses 
that  history  is  considered  in  the  next  subsection. 

2.2  Preliminary  Results 

Action  Choice 

Given a posterior belief $r \in (0, 1)$ that the state is H, the expected payoff of action a
is $r u^H(a) + (1 - r) u^L(a)$. Figure 1 portrays the content of the next result.

Lemma 1 The interval (0, 1) partitions into relatively closed subintervals $I_1, \ldots, I_M$ overlapping
at endpoints only, such that action $a_m$ is optimal when the posterior $r \in I_m$.

Proof: As noted, the payoff of each action is a linear function of r. Hence, by
the assumption that action $a_m$ is strictly best for some r, there must be a single open
subinterval of (0, 1) where action $a_m$ strictly dominates all other actions. That this is
a partition follows from the fact that there exists at least one optimal action for each
posterior $r \in (0, 1)$. ◊

We now WLOG strictly order the actions such that $a_m$ is optimal exactly when the
posterior $r \in [\bar{r}_{m-1}, \bar{r}_m] = I_m$, where $0 = \bar{r}_0 < \bar{r}_1 < \cdots < \bar{r}_M = 1$. Let us further introduce
the tie-breaking rule that individuals take action $a_m$, and not $a_{m+1}$, whenever $r = \bar{r}_m$.
Note that action $a_M$ (resp. $a_1$) is optimal when one is certain that the state is H (resp. L).
Indeed, perfect information leads one to take the correct action, and with more actions
than states, we might think of it as one of the 'extreme' actions; however, as decisions are
generally taken without the luxury of such focused beliefs, an 'insurance' action may well
be chosen.
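As a minimal sketch of the partition in Lemma 1, the snippet below computes the optimal action and the interior posterior thresholds for a hypothetical three-action payoff table. The payoff numbers and function names are assumptions made for illustration; the code also assumes the actions are already ordered and undominated, so that adjacent actions are the ones that tie at each threshold.

```python
def expected_payoff(r, u_h, u_l):
    """Expected payoff r*u^H(a) + (1-r)*u^L(a) at posterior belief r."""
    return r * u_h + (1 - r) * u_l

def optimal_action(r, payoffs):
    """Index (1-based) of the best action; ties go to the lower index (the tie-breaking rule)."""
    values = [expected_payoff(r, u_h, u_l) for u_h, u_l in payoffs]
    best = max(values)
    return next(m for m, v in enumerate(values, start=1) if v >= best - 1e-12)

def thresholds(payoffs):
    """Interior thresholds at which adjacent actions a_m and a_{m+1} are indifferent."""
    out = []
    for (uh0, ul0), (uh1, ul1) in zip(payoffs, payoffs[1:]):
        # r*uh0 + (1-r)*ul0 = r*uh1 + (1-r)*ul1  =>  r = (ul0-ul1) / ((ul0-ul1) + (uh1-uh0))
        out.append((ul0 - ul1) / ((ul0 - ul1) + (uh1 - uh0)))
    return out

# Hypothetical payoffs (u^H, u^L): a_1 is best in state L, a_3 is best in state H,
# and a_2 is an 'insurance' action that pays the same in both states.
PAYOFFS = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]

print(thresholds(PAYOFFS))
print([optimal_action(r, PAYOFFS) for r in (0.1, 0.5, 0.9)])
```

With these numbers the insurance action is chosen for intermediate posteriors, and the extreme actions only when beliefs are sufficiently focused.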

We can now see how unfair priors are equivalent to a simple payoff renormalization, as
asserted earlier. For the characterization in Lemma 1 is still valid, since reference is only
made to the posterior beliefs; moreover, the key defining indifference relation
$\bar{r}_m u^H(a_m) + (1 - \bar{r}_m) u^L(a_m) = \bar{r}_m u^H(a_{m+1}) + (1 - \bar{r}_m) u^L(a_{m+1})$ implies that

$$\text{posterior odds} = \frac{1 - \bar{r}_m}{\bar{r}_m} = \frac{u^H(a_{m+1}) - u^H(a_m)}{u^L(a_m) - u^L(a_{m+1})}.$$


of  it  such  that  the  above  holds. 

12 Here, co(A) denotes the convex hull of the set A.




Figure  1:  Example  Payoff  Frontier.  The  diagram  depicts  the  payoff  of  each  of  five 
actions  as  a  function  of  the  posterior  belief  r  that  the  state  is  H.  The  individual  simply 
chooses  the  action  yielding  the  highest  payoff. 

Since unfair priors merely serve to multiply the posterior odds by a common constant, the
thresholds $\bar{r}_0, \ldots, \bar{r}_M$ are all unchanged if we merely multiply all payoffs in state H by the
same constant.
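The renormalization argument can be spelled out in a short derivation. This is only a sketch: the symbol $\xi$ for a general prior probability of state H is notation introduced here for illustration, and $p$ denotes the private belief computed under the flat prior.

```latex
% Assumption: the prior is \nu(H) = \xi \in (0,1); \xi is illustrative notation only.
% p is the private belief computed as if the prior were flat.
\begin{align*}
\frac{1-r}{r} &= \frac{1-\xi}{\xi}\cdot\frac{1-p}{p}
  && \text{(posterior odds under the prior } \xi\text{)} \\
a_{m+1} \succeq a_m
  &\iff r\,[u^H(a_{m+1}) - u^H(a_m)] \ge (1-r)\,[u^L(a_m) - u^L(a_{m+1})] \\
  &\iff \frac{1-p}{p} \le
     \frac{\tfrac{\xi}{1-\xi}\,[u^H(a_{m+1}) - u^H(a_m)]}{u^L(a_m) - u^L(a_{m+1})}
\end{align*}
```

The last line is exactly the flat-prior comparison with every state-H payoff multiplied by the constant $\xi/(1-\xi)$, which is the sense in which an unfair prior amounts to a payoff renormalization.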


Individual  Learning 

We  now  consider  how  an  individual's  optimal  decision  rule  incorporates  the  observed 
action  history  and  his  own  private  belief.  In  so  doing,  we  could  proceed  inductively,  and 
first  derive  the  first  individual's  decision  rule  as  a  function  of  his  private  belief;  next,  we 
could  describe  how  the  second  individual  bases  his  decision  on  the  private  belief  and  on 
the first individual's action, and so on. Instead, we shall collapse this reasoning process,
and simply say that individual n knows the decision rules of all the previous agents, and
acts accordingly. He can use the common prior to calculate the ex ante (that is, as of
time-0) probability of any action profile h in either of the two states. We shall denote
these probabilities by $\pi^H(h)$ and $\pi^L(h)$, and let the resulting likelihood ratio that the state
is L (that is, low and not high) be $\ell(h) = \pi^L(h)/\pi^H(h)$. Similarly, let q(h) be the public
belief that the state is H, i.e.

$$q(h) = \frac{\pi^H(h)}{\pi^H(h) + \pi^L(h)} = \frac{1}{1 + \ell(h)}.$$


Think  of  q(h)  as  the  belief  an  individual  facing  the  history  h  would  entertain  if  he  had  a 
purely neutral private belief. Given the one-to-one relationship between q and $\ell = (1 - q)/q$,
we  may  also  loosely  refer  to  the  likelihood  ratio  as  the  public  belief. 


A final application of Bayes' rule yields the posterior belief r (that the state is H) in
terms of the public signal (or equivalently the likelihood ratio $\ell(h)$) and the private
belief p:

$$r = \frac{p \, \pi^H(h)}{p \, \pi^H(h) + (1 - p) \, \pi^L(h)} = \frac{p}{p + (1 - p) \, \ell(h)} = \frac{1}{1 + \frac{1 - p}{p} \ell(h)} \qquad (1)$$


Lemma 2 (Private Belief Thresholds) After history h is observed, there exist threshold
values $\bar{p}_m(h) \in [0, 1]$, such that $a_m$ is chosen exactly when the private belief satisfies
$p \in (\bar{p}_{m-1}(h), \bar{p}_m(h)]$, where $\bar{p}_M(h) = 1$ and for all $m = 0, \ldots, M - 1$,

$$\frac{\bar{p}_m(h)}{1 - \bar{p}_m(h)} = \frac{\bar{r}_m}{1 - \bar{r}_m} \, \ell(h) \qquad (2)$$

The proof is simple. The thresholds simply come from a well-known reformulation of (1)
as posterior odds $(1 - r)/r$ equal the private odds $(1 - p)/p$ times the likelihood ratio $\ell(h)$.
The strict inequalities are consequences of the tie-breaking rule, and the fact that (1) is
strictly increasing in p.

Observe that corresponding to $\bar{r}_0 = 0$ and $\bar{r}_M = 1$, we have $\bar{p}_0(h) = 0$ and
$\bar{p}_M(h) = 1$ after any history h. Later, when referring to (2) and elsewhere, we shall
suppress the explicit dependence of $\ell(h)$ on h whenever convenient, and write $\bar{p}_m(\ell)$ instead
of $\bar{p}_m(h)$. This is not entirely unjustified because the likelihood ratio is a sufficient statistic
for the history. Written as such, $\ell \mapsto \bar{p}_m(\ell)$ is an increasing function.
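The relationship between the posterior formula (1) and the private belief thresholds (2) can be checked numerically. The following is a minimal sketch with hypothetical function names; it simply inverts the odds relation, and the numbers used are arbitrary.

```python
def posterior(p, ell):
    """Posterior r that the state is H, from (1): odds (1-r)/r = ((1-p)/p) * ell."""
    return p / (p + (1 - p) * ell)

def private_threshold(r_bar, ell):
    """Solve (2), p/(1-p) = r_bar/(1-r_bar) * ell, for the private belief threshold p."""
    if r_bar >= 1.0:
        return 1.0  # the top threshold is 1 after any history
    odds = r_bar / (1 - r_bar) * ell
    return odds / (1 + odds)

# A private belief exactly at the threshold yields a posterior exactly at r_bar.
r_bar, ell = 0.4, 2.5
p_bar = private_threshold(r_bar, ell)
print(p_bar, posterior(p_bar, ell))

# The threshold rises with ell: the more history favors L, the stronger
# a private belief in H must be before the higher action is chosen.
print(private_threshold(r_bar, 1.0) < private_threshold(r_bar, 2.0))
```

The second print reflects the monotonicity of the threshold in the likelihood ratio noted above.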

Corporate  Learning 

We shall denote the likelihood ratio and public belief confronting individual n as $\ell_n$ and
$q_n$, respectively.13 Since the first agent has not observed any history, we shall normalize
$\ell_1 = 1$. As signals, and thereby actions, are random, the likelihood ratios $(\ell_n)_{n=1}^{\infty}$ and
public beliefs $(q_n)_{n=1}^{\infty}$ are both stochastic processes, and it is important to understand
their long-run behavior.

First, as is standard in learning models,14 the public beliefs constitute a martingale.15

Lemma  3  (The  Unconditional  Martingale)  The  public  belief  (qn)  is  a  martingale, 
unconditional  on  the  state  of  the  world. 

Proof: Individual n's action only depends on the history through $\ell_n$, or equivalently
$q_n = 1/(1 + \ell_n)$. Think of his private belief as being realized after this observation. Ex
ante to this realization, let $\alpha_m^H(q_n)$ (resp. $\alpha_m^L(q_n)$) be the conditional probability that action
$a_m$ is taken in state H (resp. state L). Then the conditional expectation of the next public
belief is

$$E[q_{n+1} \mid q_1, \ldots, q_n] = E[q_{n+1} \mid q_n]$$


13 Throughout the paper, m subscripts will denote actions, and n subscripts individuals.

14 For instance, Aghion, Bolton, Harris and Jullien (1991) establish this result for the experimentation
literature.

15 We really ought to specify the accompanying sequence of $\sigma$-algebras for the stochastic process, in order
to speak about a martingale; however, these will be suppressed because they are simply the ones generated
by the process itself.


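$$= q_n \sum_{m \in \mathcal{M}} \alpha_m^H(q_n) \, \frac{q_n \alpha_m^H(q_n)}{q_n \alpha_m^H(q_n) + (1 - q_n) \alpha_m^L(q_n)} + (1 - q_n) \sum_{m \in \mathcal{M}} \alpha_m^L(q_n) \, \frac{q_n \alpha_m^H(q_n)}{q_n \alpha_m^H(q_n) + (1 - q_n) \alpha_m^L(q_n)}$$

$$= \sum_{m \in \mathcal{M}} q_n \alpha_m^H(q_n) = q_n$$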


This martingale describes the forecast of subsequent public beliefs by individuals in the
model, since they do not, of course, know the true state of the world: Prior to receiving
his signal, individual n's best guess of the public belief that will confront his successor is
the current one. But for our purposes, an unconditional martingale tells us little about
convergence. For that, we must condition on the state of the world, and it is well-known
that this will render the public belief $(q_n)$ a submartingale in state H (and a supermartingale in
state L), i.e. $E[q_{n+1} \mid H, q_1, \ldots, q_n] \ge q_n$. This will follow from Lemma 4 below. Essentially,
the public beliefs are expected to become weakly more focused on the true state of the
world, a result much weaker than we seek.

Given that $(q_n)$ is expected to increase in state H, we at least find it rather surprising
(if easy to prove) that $(\ell_n) = ((1 - q_n)/q_n)$ remains constant in expectation.

Lemma 4 (The Conditional Martingale) Conditional on the state of the world H
(resp. state L), the likelihood ratio $(\ell_n)$ (resp. $(1/\ell_n)$) is a martingale.

Proof: Given the value of $\ell_n$, the next individual will take one of the available actions,
depending on his private belief. In state H, action $a_m$ is taken with some conditional probability
$\alpha_m^H(\ell_n)$, while in state L, the chance is $\alpha_m^L(\ell_n)$. Thus, the conditional expectation
of the likelihood ratio in state H is

$$E[\ell_{n+1} \mid H, \ell_1, \ldots, \ell_n] = E[\ell_{n+1} \mid H, \ell_n] = \sum_{m \in \mathcal{M}} \alpha_m^H(\ell_n) \, \ell_n \frac{\alpha_m^L(\ell_n)}{\alpha_m^H(\ell_n)} = \ell_n \sum_{m \in \mathcal{M}} \alpha_m^L(\ell_n) = \ell_n$$

◊
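Both martingale properties can be verified exactly in a small example. The sketch below assumes a hypothetical specification that is not taken from the paper: two actions with a single posterior threshold of 1/2, and a binary signal producing private beliefs 7/10 or 3/10. Exact rational arithmetic is used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Hypothetical binary signal: private belief -> (prob. in state H, prob. in state L).
BELIEFS = {
    Fraction(7, 10): (Fraction(7, 10), Fraction(3, 10)),
    Fraction(3, 10): (Fraction(3, 10), Fraction(7, 10)),
}
R_BAR = Fraction(1, 2)  # action 1 iff posterior r <= 1/2, else action 2 (assumed payoffs)

def action_probs(ell):
    """Return {action: [alpha_H, alpha_L]} given the public likelihood ratio ell."""
    p_bar = R_BAR * ell / (1 - R_BAR + R_BAR * ell)  # private belief threshold from (2)
    probs = {1: [Fraction(0), Fraction(0)], 2: [Fraction(0), Fraction(0)]}
    for p, (prob_h, prob_l) in BELIEFS.items():
        m = 1 if p <= p_bar else 2
        probs[m][0] += prob_h
        probs[m][1] += prob_l
    return probs

def expected_next(ell):
    """Return (E[ell_{n+1} | H, ell_n], E[q_{n+1} | q_n]) computed exactly."""
    q = 1 / (1 + ell)
    exp_ell, exp_q = Fraction(0), Fraction(0)
    for alpha_h, alpha_l in action_probs(ell).values():
        if alpha_h == 0:  # this action is never taken at ell
            continue
        next_ell = ell * alpha_l / alpha_h          # Bayes update after the action
        exp_ell += alpha_h * next_ell               # weights conditional on state H
        exp_q += (q * alpha_h + (1 - q) * alpha_l) / (1 + next_ell)  # unconditional weights
    return exp_ell, exp_q

for ell in (Fraction(1), Fraction(2), Fraction(3), Fraction(1, 5)):
    print(ell, expected_next(ell))
```

For each starting value the first component equals the current likelihood ratio and the second equals the current public belief 1/(1 + ell), including at values where every type takes the same action.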

Lemma 5 In state H (resp. state L), the likelihood ratio $(\ell_n)$ (resp. $(1/\ell_n)$) converges
almost surely to a limiting stochastic variable which takes on values in $[0, \infty)$.
Proof: This follows directly from the Martingale Convergence Theorem, as found for
instance in Breiman (1968), Theorem 5.14. ◊

That the limiting likelihood ratio (if it exists) cannot place positive weight on $\infty$ was
clear anyway, for the martingale property yields $E[\ell_n \mid H, \ell_1] = \ell_1 = 1$ for all n, and
hence $E[\ell_\infty \mid H, \ell_1] \le 1$ by Fatou's Lemma. This crucially
precludes individuals (eventually) being wholly mistaken about the true state of the world.

In the sequel, our goals are two-fold. Suppose the state is H. First, we wish to establish
general conditions guaranteeing that $\ell_n \to 0$, so that all individuals with unconcentrated
beliefs eventually learn the true state of the world. Whenever $\ell$ is very close to 0, only
individuals with very strong signals will take a suboptimal action. Second, we also wish
to prove that eventually all individuals will almost surely take the optimal action. While
such a result is perhaps more straightforward, it does not lend itself to the more general
framework we shall later consider.

More States and Actions

The analysis goes through virtually unchanged with a denumerable action space. Rather
than a finite partition of [0, 1] in Lemma 1, we get a countable partition, and thus a
countable set of posterior belief thresholds $\bar{r}$.16 In this way, Lemma 2 will yield the
threshold functions $\bar{p}$ just as above. The martingale properties of the model are preserved.
We can also handle any finite number S of states. Given pairwise mutually absolutely
continuous measures $\mu^s$ for each state, we could fix one reference state, and use it to define
$S - 1$ likelihood ratios. Again, each likelihood ratio would be a convergent conditional
martingale. The complication is largely notational, as the optimal decision rules become
rather cumbersome. Rather than the simple partitioning of [0, 1] into closed subintervals,
we would now have a unit simplex in $\mathbb{R}^{S-1}$ sliced into closed convex polytopes. We leave it
to the reader to ponder the optimal notation, but we simply assert that the above results
would still obtain.

Herding,  Cascades,  and  Complete  Learning 

We  are  now  positioned  to  define  some  fundamental  concepts.  We  find  it  best  to  think 
of  all  dynamics  as  occurring  on  two  different  levels.  From  an  observational  point  of  view, 
we  wish  to  use  the  popular  street  language  of  a  herd,  as  adopted  in  Banerjee  (1992).  Say 
that  a  herd  arises  if  for  some  n  all  agents  starting  at  the  nth  choose  the  same  action. 

But in the more general framework with multiple types and noise that we soon consider,
herds need not arise and yet convergence in beliefs may still obtain. For this notion we
first appeal to BHW's term cascade. We say that a cascade arises if after some stage
n, $\mathrm{supp}(F) \subseteq (\bar{p}_{m-1}(\ell_n), \bar{p}_m(\ell_n)]$ for some m. But even this notion is not sufficient for our
purposes. Adopting the terminology introduced in Aghion et al. (1991), we shall call the
learning complete so long as individuals' posterior beliefs eventually become arbitrarily
focused on the true state of the world: that is, if the interval $(\bar{p}_{m-1}(\ell_n), \bar{p}_m(\ell_n)]$ converges
to a set that contains supp(F) as $n \to \infty$, where action $a_m$ is optimal. Otherwise, if
posterior beliefs do not eventually become arbitrarily focused on the true state of the
world, then we shall say that learning is incomplete. Observe that complete learning will
not imply a cascade with unbounded beliefs, for there exist some individuals with signals
so strong as to not wish to ignore them; conversely, if there is a cascade on the optimal
action, then complete learning obtains.

It is easy to see the equivalence of cascades and herds. Indeed, if a cascade on action
$a_m$ arises at stage $n$ in the above sense, then by Lemma 2, individual $n+1$ will (irrespective
of the state) necessarily take action $a_m$; therefore, $\ell_{n+1} = \ell_n$ and so a cascade on
action $a_m$ exists at stage $n+1$. Thus, all private belief thresholds are unchanged by (2),
and $\mathrm{supp}(F) \subseteq (p_{m-1}(\ell_{n+1}), p_m(\ell_{n+1})]$ too. The original intuition of BHW or Banerjee
(1992) obtains: Each individual takes an action which does not reveal any of his private
information, and so the public belief is unchanged.


16 This may mean that we cannot necessarily well order the belief thresholds, nor as a result
the actions.


3.  DISCRETE  DYNAMICAL  SYSTEMS 

Before tackling the main theorems, we shall step back from the model, and consider a
mathematical abstraction that will encompass the later variations. The general framework
that we introduce includes, but is not confined to, the evolution of the likelihood ratio $(\ell_n)$
over time viewed as a stochastic difference equation.17

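The context is as follows. Given are a finite set $\mathcal{M}$, and functions $\varphi(\cdot,\cdot) : \mathcal{M} \times \mathbb{R}_+ \to \mathbb{R}_+$
and $\psi(\cdot\,|\,\cdot) : \mathcal{M} \times \mathbb{R}_+ \to [0,1]$ meeting two restrictions. First, $\psi(\cdot\,|\,\ell)$ must be a probability
measure for all $\ell \in \mathbb{R}_+$, or

$$\sum_{m \in \mathcal{M}} \psi(m|\ell) = 1.$$

Second, the following 'martingale property' must hold for all $\ell \in \mathbb{R}_+$:

$$\sum_{m \in \mathcal{M}} \psi(m|\ell)\,\varphi(m,\ell) = \ell \tag{3}$$

Finally, equip $\mathbb{R}_+ = [0,\infty)$ with the Borel $\sigma$-algebra $\mathcal{B}$, and define a transition probability
$P : \mathbb{R}_+ \times \mathcal{B} \to [0,1]$ as follows:

$$P(\ell, B) = \sum_{m \,:\, \varphi(m,\ell) \in B} \psi(m|\ell) \tag{4}$$

for any $B \in \mathcal{B}$. For our immediate application, one can think of $\psi(m|\ell)$ as the chance that
the next agent takes action $m$ when faced with likelihood $\ell$, and $\varphi(m,\ell)$ as the resulting
continuation likelihood ratio.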
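
As a minimal sketch of this abstract system, the following Python fragment encodes a purely hypothetical two-action example (the constant chances and the multipliers 1/2 and 3/2 are our own illustrative assumptions, not objects of the model), checks the two restrictions numerically, and draws a short path from the transition probability $P$.

```python
import random

# Hypothetical toy system (an assumption for illustration only):
# two actions, each taken with chance 1/2, moving ell to ell/2 or 3*ell/2.
ACTIONS = (1, 2)

def psi(m, ell):
    """Chance that the next agent takes action m at likelihood ratio ell."""
    return 0.5

def phi(m, ell):
    """Continuation likelihood ratio after action m is observed."""
    return 0.5 * ell if m == 1 else 1.5 * ell

def is_probability(ell, tol=1e-12):
    """First restriction: psi(.|ell) sums to one over the action set."""
    return abs(sum(psi(m, ell) for m in ACTIONS) - 1.0) < tol

def is_martingale(ell, tol=1e-12):
    """Second restriction (3): the expected continuation value equals ell."""
    mean = sum(psi(m, ell) * phi(m, ell) for m in ACTIONS)
    return abs(mean - ell) < tol * max(1.0, ell)

def step(ell, rng):
    """Draw the next likelihood ratio from the transition probability P(ell, .)."""
    u, cumulative = rng.random(), 0.0
    for m in ACTIONS:
        cumulative += psi(m, ell)
        if u < cumulative:
            return phi(m, ell)
    return phi(ACTIONS[-1], ell)  # guard against rounding in the cumulative sum

rng = random.Random(0)
path = [1.0]
for _ in range(20):
    path.append(step(path[-1], rng))
```

Any other pair of functions that passes `is_probability` and `is_martingale` could be substituted for `psi` and `phi` without changing `step`.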

Suppose for definiteness that we are given a (measurable) Markov stochastic process
$(\ell_n)_{n=1}^{\infty}$ on $(\Omega^H, \mathcal{E}^H, \nu^H)$, where for each $n$, $\ell_n : \Omega^H \to \mathbb{R}_+$. Transition from $\ell_n$ to $\ell_{n+1}$
is described by the transition probability $P$. We assume that $E\ell_1 < \infty$; in applications
we shall always assume that $\ell_1$ is identically 1, so this is not restrictive.18 Denote by
$\mathcal{F}_n$ the $\sigma$-field in $(\Omega^H, \mathcal{E}^H)$ generated by $(\ell_1, \ldots, \ell_n)$. Clearly, $\ell_n$ is $\mathcal{F}_n$-measurable, and it
follows from (3) that $(\ell_n, \mathcal{F}_n)$ is actually a martingale,19 thus justifying our earlier casual
description of property (3). Indeed,

$$E[\ell_{n+1} \,|\, \ell_1, \ldots, \ell_n] = E[\ell_{n+1} \,|\, \ell_n] = \int_{\mathbb{R}_+} t\,P(\ell_n, dt) = \sum_{m \in \mathcal{M}} \psi(m|\ell_n)\,\varphi(m,\ell_n) = \ell_n$$

Since $(\ell_n)$ is a martingale on $\mathbb{R}_+$, we know from the Martingale Convergence Theorem that
it converges almost surely in $\mathbb{R}_+$. Denote the limiting stochastic variable by $\ell_\infty$. We now
characterize the limit.


17  Arthur,  Ermoliev  and  Kaniovski  (1986)  consider  a  stochastic  system  with  a  seemingly  similar  structure 
—  namely,  a  'generalized  urn  scheme'.  Their  approach,  however,  differs  fundamentally  from  ours  insofar 
as  here  it  is  of  importance  not  only  how  many  times  a  given  action  has  occurred,  but  exactly  when  it 
occurred.  But  while  we  cannot  apply  their  results,  we  owe  them  a  debt  of  inspiration. 

18 Notice that the system has a discrete transition function; therefore, if $\ell_1$ has a discrete distribution
the process will be a discrete (in fact, countably infinite) Markov chain. One might think that it would
be possible to apply standard results about the convergence of discrete Markov chains, but in fact such
results are not useful here. While the state space is certainly countable, all states (which will soon be
interpreted as likelihood functions) are in general transitory, and so standard results are useless.

19 No ambiguity arises if we simply say that $(\ell_n)$ is a martingale.

Theorem 1 (Stationarity) Assume that for all $m \in \mathcal{M}$ the two functions $\ell \mapsto \varphi(m,\ell)$
and $\ell \mapsto \psi(m|\ell)$ are continuous. Suppose that $\ell_n \to \ell_\infty$ almost surely. Then for all $m \in \mathcal{M}$
and for all $\ell \in \mathrm{supp}(\ell_\infty)$, stationarity obtains, i.e.

$$\psi(m|\ell) > 0 \;\Longrightarrow\; \varphi(m,\ell) = \ell \tag{5}$$

As this theorem will follow from the next theorem, its proof is deferred. That implication
(5) is truly a stationarity condition is best seen, by means of (4), in its alternative
formulation $P(\ell, \{\ell\}) = 1$.

The intuition behind Theorem 1 is rather simple. Since $\ell_n$ converges almost surely to
$\ell_\infty$, it also converges weakly (in distribution) to $\ell_\infty$. As the process is also a Markov chain,
it is intuitive that the limiting distribution is invariant for the transition $P$, as described
in Futia (1982). In fact, we can prove Theorem 1 along these lines, but the continuity
assumptions are subtly hard-wired into the final stage of the argument to prove that the
limiting distribution is invariant. As we wish to do away with continuity, we establish an
even stronger result. Motivated by the fact that (5) is violated for $m$ exactly when neither
$\psi(m|\ell)$ nor $\varphi(m,\ell) - \ell$ is zero, we have

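Theorem 2 (Generalized Stationarity) Assume that the open interval $I \subseteq \mathbb{R}_+$ has
the property

$$\exists\, \varepsilon > 0 \;\; \forall\, \ell \in I \;\; \exists\, m \in \mathcal{M} : \quad \psi(m|\ell) > \varepsilon, \quad |\varphi(m,\ell) - \ell| > \varepsilon \tag{$\star$}$$

Then $I$ cannot contain any point from the support of the limit, $\ell_\infty$.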

Proof: Let $I$ be an arbitrary open interval satisfying ($\star$) for $\varepsilon > 0$, and suppose by way
of contradiction that there exists $\hat\ell \in I \cap \mathrm{supp}(\ell_\infty)$. Let $J = (\hat\ell - \varepsilon/2, \hat\ell + \varepsilon/2) \cap I$. By
($\star$), for all $\ell \in J$, there exists $m \in \mathcal{M}$ such that $\psi(m|\ell) > \varepsilon$, and $\varphi(m,\ell) \notin J$. Because
$\hat\ell \in \mathrm{supp}(\ell_\infty)$, there is positive probability that $\ell_n \in J$ eventually. But whenever $\ell_n \in J$,
there is a probability of at least $\varepsilon$ that $\ell_{n+1} \notin J$. Since $(\ell_n)$ is a Markov process, the events
$\{\ell_n \in J, \ell_{n+1} \notin J\}_{n=1}^{\infty}$ are independent. Thus, by the (second) Borel-Cantelli Lemma, the
process $(\ell_n)$ must almost surely leave $J$ infinitely often, contradicting the claim that
with positive probability it is eventually in $J$. Hence, $\hat\ell$ cannot exist. ◇

Corollary Assume that $\hat\ell \in \mathrm{supp}(\ell_\infty)$. Then for each $m \in \mathcal{M}$, either $\ell \mapsto \varphi(m,\ell)$ or
$\ell \mapsto \psi(m|\ell)$ is discontinuous at $\hat\ell$, or the stationarity condition (5) obtains.

Proof: If there is an $m$ such that $\hat\ell$ does not satisfy (5) and both $\ell \mapsto \varphi(m,\ell)$ and
$\ell \mapsto \psi(m|\ell)$ are continuous, then there is an open interval $I$ around $\hat\ell$ in which $\psi(m|\ell)$
and $\varphi(m,\ell) - \ell$ are both bounded away from 0. This implies that ($\star$) obtains, and so
Theorem 2 yields an immediate contradiction. ◇

Finally, it is obvious that the Corollary implies Theorem 1.

More  States  and  Actions 

Once again, we could easily have handled a countable action space $\mathcal{M}$, as the finiteness
of $\mathcal{M}$ was never used. Also, given any finite number $S > 2$ states of the world, all results
still obtain. For $(\ell_n)$ would then be a stochastic process in $\mathbb{R}^{S-1}$, and we need only refer
to the open intervals $I$ and $J$ in Theorem 2 (and its proof) as open balls.

4. THE MAIN RESULTS

We are now ready to characterize exactly when either cascades or herding arises. To do
so, we shall recast the model of section 2 in the language of section 3. Fix the likelihood
ratio $\ell$, and assume WLOG that the state is H. By Lemma 2, the individual takes action
$a_m$ exactly when his private belief is in the interval $(p_{m-1}(\ell), p_m(\ell)]$. Since this occurs with
chance $F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))$ in state H, and with chance $F^L(p_m(\ell)) - F^L(p_{m-1}(\ell))$
in state L, we have

$$\psi(m|\ell) = F^H(p_m(\ell)) - F^H(p_{m-1}(\ell)) \tag{6}$$

$$\varphi(m,\ell) = \ell \, \frac{F^L(p_m(\ell)) - F^L(p_{m-1}(\ell))}{F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))} \tag{7}$$

in the notation of section 3.
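
To see that (6) and (7) do satisfy the two restrictions of section 3, consider the following sketch. It rests on assumptions of our own choosing: two actions with posterior belief threshold 1/2, private belief distributions $F^H(p) = p^2$ and $F^L(p) = 2p - p^2$ on $[0,1]$ (so that $dF^L/dF^H = (1-p)/p$), and the threshold formula $p_m(\ell) = r_m \ell / (1 - r_m + r_m \ell)$, which presumes that a private belief $p$ and likelihood ratio $\ell$ yield the posterior $p/(p + (1-p)\ell)$.

```python
# Assumed example (not taken from the paper): two actions, posterior
# threshold 1/2, and private-belief c.d.f.s F^H(p) = p^2, F^L(p) = 2p - p^2.
R = [0.0, 0.5, 1.0]  # posterior thresholds r_0 < r_1 < r_2

def F_H(p):
    return p * p

def F_L(p):
    return 2.0 * p - p * p

def p_bar(m, ell):
    """Private-belief threshold p_m(ell), assuming posterior = p / (p + (1 - p) * ell)."""
    r = R[m]
    if r == 1.0:
        return 1.0  # the top threshold is always 1
    return r * ell / (1.0 - r + r * ell)

def psi(m, ell):
    """Equation (6): chance of action a_m in state H."""
    return F_H(p_bar(m, ell)) - F_H(p_bar(m - 1, ell))

def phi(m, ell):
    """Equation (7): continuation likelihood ratio after a_m."""
    num = F_L(p_bar(m, ell)) - F_L(p_bar(m - 1, ell))
    return ell * num / psi(m, ell)

def martingale_gap(ell):
    """Absolute violation of property (3) at ell; zero up to rounding."""
    return abs(sum(psi(m, ell) * phi(m, ell) for m in (1, 2)) - ell)
```

The gap vanishes because the chances of the actions in state L also sum to one, which is all that property (3) requires of (6) and (7).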

We know from Lemma 5 that $\ell_\infty = \lim_{n \to \infty} \ell_n$ almost surely exists. We now apply
Theorem 2 to get a precise characterization of the limiting stochastic variable. Recall that
$\mathrm{co}(\mathrm{supp}(F)) = [\underline{b}, \bar{b}]$. The crucial question is whether the individuals can have arbitrarily
informative private signals or not, i.e. whether $[\underline{b}, \bar{b}] = [0,1]$ or $[\underline{b}, \bar{b}] \subset (0,1)$.

4.1   Bounded  Beliefs 

Assume that the private beliefs are bounded. Our approach is two-fold. We first exhibit
'action absorbing basins', each corresponding to an action choice, in which all learning
stops, and individuals act irrespective of their signals. We then argue that in fact the
dynamics eventually almost surely end up in one of these basins, i.e. that cascades must
occur.

Lemma 6 (Action Absorbing Basins) There are (possibly empty) intervals $J_1, \ldots, J_M$
in $[0,\infty)$, where $J_m = \{\ell \,|\, [p_{m-1}(\ell), p_m(\ell)] \supseteq \mathrm{supp}(F)\}$, such that almost surely when
$\ell \in J_m$ the individual takes action $a_m$, and the next likelihood ratio will still be in $J_m$.
Moreover,

(1) not all intervals are empty, as $J_1 = [\bar\ell, \infty)$ and $J_M = [0, \underline\ell]$ for some $0 < \underline\ell < \bar\ell < \infty$;

(2) the intervals have disjoint interiors, and are in fact inversely ordered in the sense that
all elements of $J_{m_2}$ are strictly smaller than any element of $J_{m_1}$ when $m_2 > m_1$.

Proof: Since $p_m(\ell)$ is increasing in $m$ by Lemma 2, $[p_{m-1}(\ell), p_m(\ell)]$ is an interval for all
$\ell$. Then $J_m$ is the closure of all $\ell$ that fulfill

$$p_{m-1}(\ell) < \underline{b} \quad \text{and} \quad p_m(\ell) > \bar{b} \tag{8}$$

Then disjointness is obvious. Next, if $J_m \neq \emptyset$ then $F^H(p_{m-1}(\ell)) = 0$ and $F^H(p_m(\ell)) = 1$
for all $\ell \in J_m$. The individual will choose action $a_m$ a.s., and so no updating occurs;
therefore, the continuation value is a.s. $\ell$, as required.

With bounded beliefs, it is clear that we can always ensure one of the inequalities in (8)
for some $\ell$, but simultaneously attaining the two may well be impossible. As Lemma 2
yields $p_0(\ell) = 0$ and $p_M(\ell) = 1$ for all $\ell$, it follows that we must have $J_M = [0, \underline\ell]$ and
$J_1 = [\bar\ell, \infty)$, where $0 < \underline\ell < \bar\ell < \infty$ satisfy $p_{M-1}(\underline\ell) = \underline{b}$ and $p_1(\bar\ell) = \bar{b}$.

Finally, let $m_2 > m_1$, with $\ell_1 \in J_{m_1}$ and $\ell_2 \in J_{m_2}$. Then

$$p_{m_2-1}(\ell_1) \ge p_{m_1}(\ell_1) \ge \bar{b} > \underline{b} \ge p_{m_2-1}(\ell_2)$$

and so $\ell_2 < \ell_1$ because $p_{m_2-1}$ is strictly increasing in $\ell$. ◇

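By rearranging an expression like (2), one can show that $\ell$ satisfies (8) precisely when

$$\bar{r}_{m-1} < \frac{\underline{b}}{\underline{b} + (1 - \underline{b})\,\ell} \quad \text{and} \quad \frac{\bar{b}}{\bar{b} + (1 - \bar{b})\,\ell} < \bar{r}_m$$

This can surely also obtain for nonextreme (insurance) actions, and is less likely the smaller
is $\underline{b}$, the larger is $\bar{b}$, and the smaller is the interval $[\bar{r}_{m-1}, \bar{r}_m]$.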
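
The comparative statics just described can be checked with a small sketch. It assumes, as our own reading of the model, that a private belief $p$ and likelihood ratio $\ell$ yield the posterior $p/(p + (1-p)\ell)$, that beliefs are bounded with $0 < \underline{b} \le \bar{b} < 1$, and it works with the closed basin (weak inequalities). The function names and numerical values are hypothetical.

```python
def posterior(p, ell):
    """Posterior chance of state H from private belief p and likelihood ratio ell
    (assumes a flat prior and ell = likelihood of L relative to H)."""
    return p / (p + (1.0 - p) * ell)

def in_basin(ell, r_lo, r_hi, b_lo, b_hi):
    """True if every private belief in [b_lo, b_hi] yields a posterior in [r_lo, r_hi]."""
    return r_lo <= posterior(b_lo, ell) and posterior(b_hi, ell) <= r_hi

def basin_endpoints(r_lo, r_hi, b_lo, b_hi):
    """Closed interval of ell forming the basin, or None if the basin is empty."""
    # posterior(b_hi, ell) <= r_hi  <=>  ell >= b_hi (1 - r_hi) / (r_hi (1 - b_hi))
    low = 0.0 if r_hi == 1.0 else b_hi * (1.0 - r_hi) / (r_hi * (1.0 - b_hi))
    # posterior(b_lo, ell) >= r_lo  <=>  ell <= b_lo (1 - r_lo) / (r_lo (1 - b_lo))
    high = float("inf") if r_lo == 0.0 else b_lo * (1.0 - r_lo) / (r_lo * (1.0 - b_lo))
    return (low, high) if low <= high else None

# Two actions, threshold 1/2, private beliefs in [0.3, 0.7]: both extreme basins exist.
J1 = basin_endpoints(0.0, 0.5, 0.3, 0.7)
J2 = basin_endpoints(0.5, 1.0, 0.3, 0.7)

# A middle 'insurance' action with posterior interval [0.4, 0.6] has an empty basin
# for the same beliefs, but a wider interval with tighter beliefs has a nonempty one.
middle_empty = basin_endpoints(0.4, 0.6, 0.3, 0.7)
middle_nonempty = basin_endpoints(0.3, 0.7, 0.45, 0.55)
```

In this example the basin of the low action is $[7/3, \infty)$ and that of the high action is $[0, 3/7]$.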

Theorem 3 (Cascades) Assume the private beliefs are bounded, and let $\ell_n \to \ell_\infty$. Then
$\ell_\infty \in J_1 \cup \cdots \cup J_M$ almost surely.

Proof: Suppose by way of contradiction that $\ell_n \to \hat\ell \notin J_1 \cup \cdots \cup J_M$ with positive
probability. Assume WLOG the state is H. Then for some $m$ we have $0 < F^H(p_m(\hat\ell)-) < 1$,
so that individuals will strictly prefer to choose action $a_m$ for some private beliefs and $a_{m+1}$
for others. Consequently, $p_m(\hat\ell) > \underline{b}$, and since $p_0(\hat\ell) = 0 < \underline{b}$, the least such $m$ satisfying
$p_m(\hat\ell) > \underline{b}$ is well-defined. So we may assume $F^H(p_{m-1}(\hat\ell)-) = 0$.

Next, $F^H(p_m(\ell)) > 0$ in a neighborhood of $\hat\ell$. There are two possibilities:

Case 1. $F^H(p_m(\hat\ell)) > F^H(p_{m-1}(\hat\ell))$.
Here, there will be a neighborhood around $\hat\ell$ where $F^H(p_m(\ell)) - F^H(p_{m-1}(\ell)) > \varepsilon$ for some
$\varepsilon > 0$. We see from (6) that in this neighborhood $\psi(m|\ell)$ is bounded away from 0, while
(7) reduces to $\varphi(m,\ell) = \ell F^L(p_m(\ell))/F^H(p_m(\ell))$, which is also bounded away from $\ell$ for $\ell$
in a neighborhood of $\hat\ell$. Indeed, $p_m(\hat\ell)$ is in the interior of $\mathrm{co}(\mathrm{supp}(F))$, and so Lemma A.2
guarantees us that $F^L(p_m(\ell))$ is bounded above and away from $F^H(p_m(\ell))$ for $\ell$ near $\hat\ell$
(recall that $p_m$ is continuous). By Theorem 2, $\hat\ell \in \mathrm{supp}(\ell_\infty)$ therefore cannot occur.

Case 2. $F^H(p_m(\hat\ell)) = F^H(p_{m-1}(\hat\ell))$.
This can only occur if $F^H$ has an atom at $p_{m-1}(\hat\ell) = \underline{b}$, and places no weight on $(\underline{b}, p_m(\hat\ell)]$.
It follows from $F^H(p_{m-1}(\hat\ell)-) = 0$ and $p_{m-2} < p_{m-1}$, that $F^H(p_{m-2}(\ell)) = 0$ for all $\ell$ in a
neighborhood of $\hat\ell$. Therefore, $\psi(m-1|\ell)$ and $\varphi(m-1,\ell) - \ell$ are bounded away from 0
on an interval $[\hat\ell, \hat\ell + \eta)$, for some $\eta > 0$. On the other hand, the choice of $m$ ensures that
$\psi(m|\ell)$ and $\varphi(m,\ell) - \ell$ are bounded away from 0 on an interval $(\hat\ell - \eta', \hat\ell]$, for some $\eta' > 0$.
So, once again Theorem 2 (observe the order of the quantifiers!) proves that $\hat\ell \notin \mathrm{supp}(\ell_\infty)$. ◇

Theorem 4 (Herds) Assume the private beliefs are bounded. Then a herd on some
action will almost surely arise in finite time. Absent extreme belief thresholds $\bar{r}_1$ and $\bar{r}_{M-1}$,
the herd can arise on an action other than the most profitable one.

Proof: First note that if $\bar{r}_1 > \bar{b}$ (resp. $\bar{r}_{M-1} < \underline{b}$) then the first and thus all subsequent
individuals a.s. ignore their private signals and take action $a_1$ (resp. $a_M$). Now suppose
this does not occur, and assume WLOG the state is H. Whenever $\ell \notin [0, \underline\ell]$, we know that
$F^H(p_{M-1}(\ell)) > 0$ so that some action other than $a_M$ is taken with positive probability.

Claim 1: With positive chance, $\ell_n \in J_1$ in a fixed finite number of steps.
Consider the following 'fastest ascent' of the likelihood ratio. Suppose that whenever two or
more actions can be taken with positive probability, private beliefs are such that the lowest
numbered action is taken. This will have the effect of pushing the public belief toward state
L. Then the likelihood will evolve according to $\ell_{n+1} = \ell_n F^L(p_m(\ell_n))/F^H(p_m(\ell_n)) > \ell_n$.
But this can happen only a finite number of times before $\ell_n \ge \bar\ell$. This follows from
Lemma A.3, and the fact that $\ell_n \in [\underline\ell, \bar\ell]$, and so $p_m(\ell_n) \in [p_m(\underline\ell), p_m(\bar\ell)] \subset (0,1)$. Indeed,
we have

$$\ell_{n+1} = \ell_n \, \frac{F^L(p_m(\ell_n))}{F^H(p_m(\ell_n))} \ge \ell_n \, \frac{1 - p_m(\ell_n)}{p_m(\ell_n)},$$

which proves that the step size is bounded below. So, if the likelihood ratio does not start
in $[0, \underline\ell]$ then it ends up there with a probability strictly less than 1.

Claim 2: $\ell_n \in J_1 \cup \cdots \cup J_M$ almost surely in finite time.
Because $(\ell_n)$ is not simply a finite state Markov chain, Theorem 3 does not immediately
imply convergence in finite time, for we could conceivably have $\ell_n \to J_1 \cup \cdots \cup J_M$ but
$\ell_n \notin J_1 \cup \cdots \cup J_M$ for all $n$. But in fact this cannot occur, because if the convergence took
an infinite number of steps, then the (second) Borel-Cantelli Lemma would imply that the
'upcrossing' of Claim 1 would happen sooner or later, as the events $\{\ell_{n+1} > \ell_n\}_{n=1}^{\infty}$ are
independent, conditional on the complement of $J_1 \cup \cdots \cup J_M$ and on the state of the world H. ◇

So the bottom line is that all individuals eventually stop paying heed to their private
signals, at which point the herd begins. Furthermore, herds are either 'correct' or 'incorrect',
and arise precisely because it is common knowledge that there are no private beliefs strong
enough to overturn the public belief. This is essentially the major pathological learning
result obtained by Banerjee (1992) and BHW, albeit extended to $M > 2$ actions.20

While we do not assert how fast the convergence occurs, it is easy to see that for $\ell$
outside $J_1 \cup \cdots \cup J_M$, $(\log(\ell_n))$ follows a random walk, albeit on a countable state space.
Still, with absorbing barriers after a fixed number of the same parity jump, results in
Billingsley (1986), pp. 128-130, will imply that convergence must be exponentially fast.
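
The possibility of an incorrect herd is easy to reproduce numerically. The sketch below is our own illustration under assumed primitives, not the general model: two actions with posterior threshold 1/2, and a binary signal of precision 7/10 in the spirit of BHW, so that private beliefs take only the values 3/10 and 7/10. Exact rational arithmetic is used so that ties at the thresholds are resolved without rounding error.

```python
from fractions import Fraction
import random

ALPHA = Fraction(7, 10)  # assumed signal precision: private beliefs are 3/10 or 7/10
R1 = Fraction(1, 2)      # assumed posterior threshold: a_2 iff posterior of H > 1/2

def takes_a2(p, ell):
    """Optimal rule: a_2 iff the posterior p / (p + (1 - p) * ell) exceeds R1."""
    return p / (p + (1 - p) * ell) > R1

def chance_a2(ell, state):
    """Chance that a_2 is taken at ell in the given state ('H' or 'L')."""
    hi = ALPHA if state == "H" else 1 - ALPHA  # chance of the high signal
    total = Fraction(0)
    if takes_a2(ALPHA, ell):
        total += hi
    if takes_a2(1 - ALPHA, ell):
        total += 1 - hi
    return total

def update(ell, a2_taken):
    """Bayes update of ell = P(L)/P(H) after the observed action."""
    c_h, c_l = chance_a2(ell, "H"), chance_a2(ell, "L")
    return ell * c_l / c_h if a2_taken else ell * (1 - c_l) / (1 - c_h)

def run(rng, max_steps=500):
    """Simulate state H from ell = 1; return (herd action, stage at which it starts)."""
    ell = Fraction(1)
    for n in range(max_steps):
        if takes_a2(ALPHA, ell) == takes_a2(1 - ALPHA, ell):
            return (2 if takes_a2(ALPHA, ell) else 1), n  # signals no longer matter
        p = ALPHA if rng.random() < float(ALPHA) else 1 - ALPHA
        ell = update(ell, takes_a2(p, ell))
    return None, max_steps

rng = random.Random(12345)
results = [run(rng) for _ in range(2000)]
wrong_share = sum(1 for action, _ in results if action == 1) / len(results)
```

In this example the state is H, so a herd on $a_1$ is incorrect. A first-step analysis of the three transient and absorbing positions of $\ell$ gives the chance of an incorrect herd as $0.3/(1 - 0.7 \cdot 0.3) = 30/79$, and `wrong_share` should lie near that value.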

4.2  Unbounded  Beliefs 

Next we present the counterpart to Theorem 3 that was not considered in Banerjee
(1992) and BHW. Strictly bounded beliefs turn out to have been the mainstay for their
striking pathological herding results.

Theorem 5 (Complete Learning) If the private beliefs are unbounded then almost
surely $\ell_n \to 0$ in state H, and $\ell_n \to \infty$ in state L.

Proof: As usual, let $\ell_\infty$ denote the limit of $(\ell_n)$ and assume WLOG the state is H. As
Lemma 5 tells us that $\mathrm{supp}(\ell_\infty) \subseteq [0, \infty)$, it suffices to prove that

Claim: $\mathrm{supp}(\ell_\infty) \cap (1/N, N) = \emptyset$ for any natural number $N > 1$.
Let $I_N = (1/N, N)$. First note that with unbounded private beliefs, $\psi(1|\ell) = F^H(p_1(\ell))$ is
bounded away from 0 on $I_N$. Next recall that by Lemma A.2, $F^L(r) - F^H(r)$ is increasing
on $[0, 1/2)$ and decreasing on $(1/2, 1]$; therefore, $F^L(p_1(\ell)) - F^H(p_1(\ell))$ is bounded away
from 0 on $I_N$, as is

$$\varphi(1,\ell) - \ell = \ell \left[ \frac{F^L(p_1(\ell))}{F^H(p_1(\ell))} - 1 \right]$$

It then follows from Theorem 2 that $I_N$ does not contain any point from $\mathrm{supp}(\ell_\infty)$. Finally,
let $N \to \infty$ to prove that $\mathrm{supp}(\ell_\infty) = \{0\}$. ◇

20 The analysis of BHW also handled more states. We soon address this generalization.

So if beliefs are unbounded, then eventually everyone becomes 'convinced' of the
true state of the world. That is, it becomes ever harder for a partially revealing private
signal to induce an individual to take any other action than the optimal one. Crucially
observe that belief cascades cannot possibly arise with unbounded beliefs. This follows both
from the earlier definition, and the fact that ignoring arbitrarily focused private signals
cannot possibly be an optimal policy. Consequently, we cannot (yet) preclude that an
infinite subsequence of individuals may get a string of sufficiently perverse signals to lead
each to take a suboptimal action.

It is noteworthy that whenever an individual takes a contrary action, subsequent
individuals have no choice but to conclude that his signal was very strong, and this is reflected
in a draconian revision of the public belief. We say that the herd has been overturned by
the unexpected action. We shall more carefully formulate this as the overturning principle,
as it proves central to an understanding of the observational learning paradigm. Assume
individual $n$ chooses action $a_m$. Then individual $n+1$ should, before he gets his own private
signal, find it optimal to choose action $a_m$ because he knows no more than individual $n$,
and because it is common knowledge that $n$ rationally chose $a_m$. So, the likelihood ratio
after individual $n$'s action, $\ell(h, a_m)$, satisfies

$$\pi^H(h, a_m) = \frac{1}{1 + \ell(h, a_m)} \in (\bar{r}_{m-1}, \bar{r}_m],$$

which is the content of the next lemma.

Lemma 7 (The Overturning Principle) For any history $h$, if an individual optimally
takes action $a_m$, then the updated likelihood ratio must satisfy

$$\ell(h, a_m) \in \left[ \frac{1 - \bar{r}_m}{\bar{r}_m}, \; \frac{1 - \bar{r}_{m-1}}{\bar{r}_{m-1}} \right)$$


The  proof  is  found  in  Appendix  B. 
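
The overturning principle can be seen at work in a small example. The sketch below assumes, purely for illustration, two actions with posterior threshold $\bar{r}_1 = 1/2$ and the unbounded private belief distributions $F^H(p) = p^2$ and $F^L(p) = 2p - p^2$; the private belief threshold $p_1(\ell) = \ell/(1+\ell)$ follows from the assumed posterior formula $p/(p + (1-p)\ell)$. The lemma then predicts an updated likelihood ratio in $[1, \infty)$ after $a_1$ and in $[0, 1)$ after $a_2$, whatever the prior likelihood ratio.

```python
def F_H(p):
    return p * p            # assumed c.d.f. of private beliefs in state H

def F_L(p):
    return 2.0 * p - p * p  # assumed c.d.f. of private beliefs in state L

def p1(ell):
    """Private-belief threshold for posterior threshold 1/2 (an assumption)."""
    return ell / (1.0 + ell)

def after_a1(ell):
    """Updated likelihood ratio once the low action a_1 is observed."""
    return ell * F_L(p1(ell)) / F_H(p1(ell))

def after_a2(ell):
    """Updated likelihood ratio once the high action a_2 is observed."""
    return ell * (1.0 - F_L(p1(ell))) / (1.0 - F_H(p1(ell)))

def respects_lemma(ell):
    """Check the two intervals of the overturning principle at a given ell."""
    return after_a1(ell) >= 1.0 and 0.0 <= after_a2(ell) < 1.0
```

For these particular distributions the updates simplify to $2 + \ell$ after $a_1$ and $\ell/(1 + 2\ell)$ after $a_2$, so that even when the public belief is nearly certain of state H (say $\ell = 0.001$), a single contrary action pushes the likelihood ratio above 2.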

Together  with  Theorem  5,  the  overturning  principle  implies  that  herds  in  fact  do  occur 
—  but  only  of  the  nonpathological  variety. 

Theorem 6 (Correct Herds) If the private beliefs are unbounded, then almost surely
all individuals eventually take the optimal action.

Proof: Assume WLOG that the state is H, so that $a_M$ is optimal. Theorem 5 asserts that
$\ell_n \to 0$ a.s., and so $\ell_n$ is eventually in the neighborhood $[0, (1 - \bar{r}_{M-1})/\bar{r}_{M-1})$ of 0. But by Lemma 7,
whenever any other action than action $a_M$ is taken, we exit that neighborhood. ◇

More  States  and  Actions 

The convergence results of Theorems 3 and 5 do not depend on the action space being
denumerable. In the proof of Theorem 3, a technical complication arises, as our choice of
the least $m$ such that $p_m(\ell) > \underline{b}$ was well-defined because there were only finitely many
actions. Otherwise, we could instead just pick $m$ so that $p_m$ is close enough to $\underline{b}$ such that
all the 'bounded away' assertions hold. Similarly, in the proof of Theorem 5, we could
substitute a minimum action threshold $p_1$ by one that is arbitrarily close to 0.

Complications  are  more  insidious  when  it  comes  to  Theorems  4  and  6.  First  note  that 
with $M = \infty$, both results still obtain without any qualifications provided a unique action
is  optimal  for  posteriors  sufficiently  close  to  0  and  1,  for  then  the  overturning  principle  is 
still  valid  near  the  extreme  actions.  But  otherwise,  we  must  change  our  tune.  For  instance, 
with  Theorem  6,  there  may  exist  an  infinite  sequence  of  distinct  optimal  'insurance'  action 
choices  made  such  that  the  likelihood  ratio  nonetheless  converges.  This  obviously  requires 
that the optimality intervals $I_m$ shrink to a point, which robs the overturning argument of
its  strength.  Yet  this  is  not  a  serious  nonrobustness  critique,  because  the  payoff  functions 
of  the  actions  taken  by  individuals  must  then  converge! 

By  contrast,  incorporating  more  than  two  states  of  the  world  is  rather  simple,  and  the 
modifications  outlined  at  the  end  of  section  2  essentially  apply  here  too. 

5.  NOISE 

We now turn to the economic robustness of the existing theory, by striking at its central
underpinnings. The key role played by the overturning principle is in many ways unsettling:
It does not seem 'reasonable' that such large weight be afforded the observation of a single
individual's action. For this reason, we first introduce noise into the system, whereby a
small fixed flow of individuals either deliberately (that is, they are a 'crazy' type), or by
accident (i.e. they 'tremble') do not choose their optimal action. Consequently, no action
will have drastic effects, simply because the 'unexpected' is really expected to happen
every now and then.

Two  theses  then  seem  plausible  at  this  point: 

1.  The  statistically  constant  nature  of  noisy  individuals  does  not  jeopardize  the  learning 
process  of  the  rational  informed  individuals  in  the  long  run,  as  it  can  be  filtered  out: 
If  the  likelihood  ratio  has  a  trend  towards  zero  without  the  noise,  that  trend  will  be 
preserved  as  the  underlying  force  even  with  the  additional  noise. 

2.  The  learning  will  be  incomplete,  as  the  stream  of  isolated  crazy  individuals  making 
contrary  choices  will  eventually  be  indistinguishable  from  the  background  noise,  and 
the  public  belief  will  thus  not  tend  to  an  extreme  value. 

We  show  in  this  section  that  in  fact  the  first  intuition  is  correct,  so  that  Theorems  3  and  5 
will  still  hold.  We  then  turn  to  the  more  thorny  issue  of  herding. 

Two  Forms  of  Noise 

Just  to  be  clear,  we  assume  that  whether  an  individual  is  noisy  is  not  public  information, 
and  is  distributed  independently  across  individuals. 

Craziness. We first posit the existence of crazy individuals in the model. Assume that
with probability $r_m$, individual $n$ will always take action $a_m$, regardless of history. To avoid
trivialities, we assume a positive fraction $r = 1 - \sum_{m=1}^{M} r_m > 0$ of 'sane' individuals. In the
language of section 3, the dynamics of the likelihood ratio in state H are now described
as follows:

$$\psi(m|\ell) = r_m + r\left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right] \tag{10}$$

$$\varphi(m,\ell) = \ell \, \frac{r_m + r\left[F^L(p_m(\ell)) - F^L(p_{m-1}(\ell))\right]}{r_m + r\left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right]} \tag{11}$$


Trembling. In the second manifestation of noise, all individuals are rational, but some
may 'tremble', in the sense of Selten (1975). In particular, individuals randomly take a
suboptimal action with probability $\tau(\ell)$ when the likelihood ratio is $\ell$; for simplicity, assume
that in this event, all other $M - 1$ actions are equally likely. With $M = 2$, individuals'
actions are wholly uninformative when $\tau(\ell) = 1/2$, so assume that $\tau(\ell)$ is boundedly
smaller than 1/2. On the other hand, to avoid completely trivializing the noise, we further
insist that $\tau(\ell)$ be bounded away from 0. In state H the dynamics are now

$$\psi(m|\ell) = [1 - \tau(\ell)]\left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right] + \frac{\tau(\ell)}{M-1} \sum_{j \neq m} \left[F^H(p_j(\ell)) - F^H(p_{j-1}(\ell))\right]$$

$$\phantom{\psi(m|\ell)} = [1 - \tau(\ell)]\left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right] + \frac{\tau(\ell)}{M-1}\left[1 - F^H(p_m(\ell)) + F^H(p_{m-1}(\ell))\right] \tag{12}$$

and

$$\varphi(m,\ell) = \ell \, \frac{[1 - \tau(\ell)]\left[F^L(p_m(\ell)) - F^L(p_{m-1}(\ell))\right] + \frac{\tau(\ell)}{M-1}\left[1 - F^L(p_m(\ell)) + F^L(p_{m-1}(\ell))\right]}{[1 - \tau(\ell)]\left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right] + \frac{\tau(\ell)}{M-1}\left[1 - F^H(p_m(\ell)) + F^H(p_{m-1}(\ell))\right]} \tag{13}$$

'Noise Traders'. We could imagine a third form of noise whereby a fraction of individuals
receive no private signal, and therefore simply free-ride off the public information. These
are analogous to the 'noise traders' that richly populate the financial literature. But they
require no special treatment here, as they are subsumed in the standard model outlined
in section 2. For if $\mu^H$ and $\mu^L$ have a common atom accorded the same probability under
each measure, then $F^H$ and $F^L$ will each have an atom at 1/2. Since a noise trader is
precisely someone who has the private belief equal to the common prior, namely 1/2, all
results from section 4 now carry over.

Asymptotic  Learning 

We are now ready to investigate the effects of noise. Observe that with bounded beliefs,
the interval structure of the action absorbing basins $J_1, \ldots, J_M$ in $[0, \infty)$ obtains just as
before. The mere existence of action absorbing basins deserves some commentary. For one
might intuit that the likelihood ratio can no longer settle down: Eventually some noisy
individual will appear and take an action so unexpected as to push the next likelihood
ratio outside the putative action absorbing basin. The flaw in this logic is that precisely
because the action was unexpected, the individual will be adjudged ex post to have been
noisy, and his action will thus be ignored.
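
To illustrate how noise mutes an unexpected action, the following sketch compares the update after a contrary action with and without crazy types. All primitives are our own assumptions for illustration: two actions with posterior threshold 1/2, private belief distributions $F^H(p) = p^2$ and $F^L(p) = 2p - p^2$, a chance of 0.05 for each crazy type, and the crazy-type dynamics in which the chance of action $a_m$ in each state is the crazy chance plus the sane fraction times the no-noise chance.

```python
def F_H(p):
    return p * p            # assumed c.d.f. of private beliefs in state H

def F_L(p):
    return 2.0 * p - p * p  # assumed c.d.f. of private beliefs in state L

R_CRAZY = {1: 0.05, 2: 0.05}          # assumed chances of the two crazy types
R_SANE = 1.0 - sum(R_CRAZY.values())  # fraction of sane individuals

def interval_mass(F, m, ell):
    """Chance under c.d.f. F that a sane individual takes a_m (threshold 1/2 assumed)."""
    p = ell / (1.0 + ell)
    return F(p) if m == 1 else 1.0 - F(p)

def psi(m, ell):
    """Chance of action a_m in state H with crazy types."""
    return R_CRAZY[m] + R_SANE * interval_mass(F_H, m, ell)

def phi(m, ell):
    """Continuation likelihood ratio after a_m with crazy types."""
    return ell * (R_CRAZY[m] + R_SANE * interval_mass(F_L, m, ell)) / psi(m, ell)

def phi_no_noise(m, ell):
    """Continuation likelihood ratio after a_m in the model without noise."""
    return ell * interval_mass(F_L, m, ell) / interval_mass(F_H, m, ell)

ELL = 0.01                                # public belief strongly favours state H
with_noise = phi(1, ELL)                  # contrary action a_1, noisy model
without_noise = phi_no_noise(1, ELL)      # contrary action a_1, no noise
```

Without noise the contrary action overturns the public belief (the likelihood ratio jumps above 2 in this example), whereas with crazy types it moves the likelihood ratio only modestly, since the action is largely attributed to noise.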

We  now  show  that  Theorems  3  and  5  go  through  unchanged. 


Theorem 7 (Convergence) Augment the standard model by one of the first two types
of noise, and assume the state is H. Then $\ell_n \to \ell_\infty$ for some random variable $\ell_\infty$. If the
beliefs are bounded, then $\ell_\infty \in J_1 \cup \cdots \cup J_M$ almost surely, while if the beliefs are unbounded,
$\ell_\infty = 0$ almost surely.

Proof: Since $(\ell_n)$ is still a martingale in state H, the Martingale Convergence Theorem
assures us that $\ell_\infty$ exists, and is almost surely finite. Let $\ell \in \mathrm{supp}(\ell_\infty)$.

Case 1: Crazy Agents.
Here, the transition dynamics are given by (10) and (11), and so $\psi(m|\ell)$ is bounded away
from 0, since $r_m > 0$ by assumption. On the other hand,

$$\varphi(m,\ell) - \ell = \ell\, r \, \frac{\left[F^L(p_m(\ell)) - F^L(p_{m-1}(\ell))\right] - \left[F^H(p_m(\ell)) - F^H(p_{m-1}(\ell))\right]}{\psi(m|\ell)}$$

and so $\varphi(m,\ell) - \ell = 0$ is satisfied under exactly the same circumstances as in the proofs
of Theorems 3 and 5, because $r \neq 0$. Some consideration reveals that those proofs go
through just as before.

Case 2: Trembling Agents.
As with the first type of noise, all actions are taken with positive probability, and so $\psi(m|\ell)$
is indeed bounded away from 0 by (12). We wish to argue once more that $\varphi(m,\ell) - \ell = 0$
is satisfied under exactly the same conditions as in the proofs of Theorems 3 and 5. We
can then use (13) to rewrite $\varphi(m,\ell) = \ell$ as follows:

$$[1 - \tau(\ell)]\,F^L(p_m(\ell)) + \frac{\tau(\ell)}{M-1}\left[1 - F^L(p_m(\ell))\right] = [1 - \tau(\ell)]\,F^H(p_m(\ell)) + \frac{\tau(\ell)}{M-1}\left[1 - F^H(p_m(\ell))\right]$$

which, whenever $F^L(p_m(\ell)) \neq F^H(p_m(\ell))$, is equivalent to

$$1 - \tau(\ell) = \frac{\tau(\ell)}{M-1}$$

or simply $\tau(\ell) = 1 - 1/M$. But this violates the assumption $\tau(\ell) < 1/2$. Therefore, the
proof of Theorem 3 obtains once again when $\tau(\ell) < 1/2$, while $\varphi(1,\ell) - \ell$ is bounded away
from 0 on $I_N$, and so Theorem 5 goes through just as before also. ◇

The  above  theorem  argues  that  whether  individuals  eventually  learn  the  true  state  of 
the  world  is  surprisingly  unaffected  by  a  small  amount  of  constant  background  noise.  But 
the  corresponding  purely  observational  results  on  herding,  namely  Theorems  4  and  6,  can 
no  longer  obtain  without  modification.  Indeed,  it  is  impossible  that  all  individuals  take 
the  same  action. 

Let's only ask that all rational individuals take the same action in a herd. With such
a redefinition, Theorem 4 is still valid: Herds will arise with positive probability under
bounded beliefs, as $\ell_n$ must reach an action absorbing basin in finite time. But as the
proof of Theorem 6 critically invokes the (now invalid) overturning principle, we cannot
guarantee that a herd (correct or not) will almost surely arise.

This turns on the speed of convergence of the public belief. For a herd is tantamount to an infinite string of individuals having private beliefs that are not strong enough to counteract the public belief. Suppose that the state is H, so that we know that $q_n \to 1$ a.s. by the previous theorem. Then a correct herd arises in finite time so long as there is not an infinite string of 'herd violators' (individuals with private beliefs below $p_M(\ell)$). In light of the (first) Borel-Cantelli Lemma, this occurs with zero chance provided

$$\sum_{n=1}^{\infty} F^H\!\left(\frac{(1-q_n)\,\bar r_M}{(1-q_n)\,\bar r_M + q_n\,(1-\bar r_M)}\right) < \infty$$

At the moment, we cannot determine whether this inequality occurs almost surely or even with positive probability. But if the public belief converges at, say, an exponential rate, then because $F^H$ has no atom at 0 or 1 by assumption, the sum will be finite.
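The role of the convergence rate can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it borrows $F^H(p) = p^2$ from the example of Section 6, fixes a hypothetical cutoff `R_M = 0.5`, and compares a hypothetical exponential path $q_n = 1 - e^{-n}$ with a hypothetical slow path $q_n = 1 - n^{-1/2}$. None of these paths is derived from the model.

```python
import math

def F_H(p):
    # Assumed private-belief c.d.f. in state H (borrowed from the later example).
    return p * p

def violation_threshold(q, r_M):
    # Private belief below which the posterior falls under the cutoff r_M
    # when the public belief is q (the argument of F^H in the series).
    return (1 - q) * r_M / ((1 - q) * r_M + q * (1 - r_M))

def partial_sum(q_of_n, r_M, n_max):
    # Partial sum of the Borel-Cantelli series for n = 1..n_max.
    return sum(F_H(violation_threshold(q_of_n(n), r_M)) for n in range(1, n_max + 1))

R_M = 0.5  # hypothetical posterior cutoff

def fast(n):
    return 1 - math.exp(-n)        # hypothetical exponential convergence

def slow(n):
    return 1 - 1 / math.sqrt(n)    # hypothetical polynomial convergence

fast_sum = partial_sum(fast, R_M, 200)
slow_100 = partial_sum(slow, R_M, 100)
slow_10000 = partial_sum(slow, R_M, 10000)
print(fast_sum, slow_100, slow_10000)
```

Under these assumptions the exponential path gives a geometric series with a finite sum, whereas the slow path reduces the series to the harmonic series, whose partial sums keep growing.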

More  States  and  Actions 

With  a  denumerable  action  space,  the  only  subtlety  that  arises  is  with  the  trembling 
formulation,  where  we  shall  insist  upon  a  finite  support  of  the  tremble  from  any  L  with 
all  those  destination  actions  equilikely. 

With  more  than  two  states  the  arguments  go  through  virtually  unchanged. 

6.   MULTIPLE  INDIVIDUAL  TYPES 

6.1   Introduction  and  Motivation 

Parallel  to  the  Experimentation  Literature 

The  results  so  far  bear  some  similarity  to  the  stylized  predictions  of  the  single-person 
learning  theory,  but  are  analytically  much  simpler.  For  inasmuch  as  individuals  may 
ignore  any  future  ramifications  of  their  actions,  the  resulting  decision  problem  they  solve 
is  trivial  by  comparison.21  And  while  it  is  the  value  of  information  that  sustains  individual 
experimentation,  observational  learning  by  contrast  is  bolstered  solely  by  the  diversity  of 
signals that subsequent individuals may entertain. There is therefore no need for the ad hoc and involved methods of control theory that have dogged the experimentation literature, and greatly restricted its applicability.

Recall  first  an  early  result  of  this  genre  due  to  Rothschild  (1974).  He  considered  a  classic 
economic  illustration  of  the  probabilist's  'two-armed  bandit':  An  infinite-lived  impatient 
monopolist  optimally  experiments  with  two  possible  prices  each  period.  Rothschild  showed 
that  the  monopolist  would  (i)  eventually  settle  down  on  one  of  the  prices  almost  surely, 
and  (ii)  with  positive  probability  settle  down  on  the  less  profitable  price.  We  wish  to  draw 
some  parallels  with  our  bounded  support  beliefs  case:  The  first  result  above  corresponds 
to  the  belief  cascades  of  Theorem  3  —  for  in  both  learning  paradigms,  the  likelihood  ratio 
enters  an  action  absorbing  basin,  after  which  future  signals  are  ignored.  The  second  more 
striking  pathological  result  corresponds  to  the  possibility  of  misguided  herds,  as  described 
in  Theorem  4:  simply  put,  there  is  always  one  action  absorbing  basin  that  leads  individuals 
to  adopt  the  most  unprofitable  action. 


21  Even  more  difficult  is  the  marriage  of  the  observational  and  experimental  paradigms.  For  instance, 
while  Smith  (1991)  explicitly  tried  to  avoid  this  problem,  Bolton  and  Harris  (1993)  have  recently  blended 
the  two  paradigms,  to  investigate  the  interplay  between  these  forms  of  learning.  Most  notable  among  their 
findings is that when long-lived individuals' experimentation is publicly observable, there is an additional
dynamic  incentive  to  experiment,  namely  the  desire  to  'encourage'  future  experimentation  by  others. 

This analogy is not without independent interest, as it foreshadows our next principal
finding.  For  with  heterogeneous  preferences,  an  interesting  new  twist  is  introduced.  As 
with  the  noise  formulation,  we  assume  that  an  individual's  type  is  his  private  information. 
Yet  everyone  will  be  able  to  extract  information  from  history  by  comparing  the  proportion 
of  individuals  choosing  each  action  with  the  known  frequencies  of  preference  types.  So 
long  as  all  types  do  not  have  the  same  frequency,  this  inference  intuitively  ought  to  be 
fruitful.  Surprisingly,  however,  the  learning  dynamics  may  in  fact  converge  upon  a  wholly 
uninformative  outcome,  in  which  each  action  is  taken  with  the  same  probability  in  all 
states. We shall argue that this 'twin pathology', which we dub confounded learning,
arises  even  with  unbounded  private  beliefs  —  that  is,  even  when  herding  cannot  occur. 
The  essential  requirement  for  confounded  learning  is  that  there  be  at  least  as  many  states 
of  the  world  as  actions.  For  otherwise,  there  will  generically  always  be  an  action  which  is 
not  taken  with  the  same  probability  in  each  state. 

Barring  confounded  learning,  a  'herd'  may  arise:  By  this,  we  now  mean  that  everyone  of 
the  same  preference  type  will  take  the  same  action.  Provided  some  types'  vNM  preferences 
are  not  identical,  the  overturning  argument  will  (sometimes)  fail  here  just  as  it  did  with 
noise:  Unexpected  actions  need  not  radically  affect  beliefs,  because  the  successors  will  also 
entertain  the  hypothesis  that  the  individual  was  simply  of  a  different  type. 

A  Simple  Example  of  Confounded  Learning 

There  are  several  issues  we  wish  to  investigate,  and  thus  find  it  most  convenient  to  just 
consider  the  simplest  possible  specification  of  this  model. 

Assume that there are M = 2 actions and two preference types, labelled A and B. Individuals are of type A with chance $\tau_A$, where the preferences of A are just as before; namely, in state H action $a_2$ is preferred over $a_1$, and conversely so in state L; type A individuals will be indifferent when their private belief equals $p^A(\ell)$. Type B individuals have the opposite preferences, preferring $a_1$ to $a_2$ in state H, and conversely in state L, and having the private belief threshold $p^B(\ell)$. If we assume WLOG that the state is H, then the dynamics are described by

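$$\psi(1|\ell) = \tau_A F^H(p^A(\ell)) + (1-\tau_A)\,[1 - F^H(p^B(\ell))] \tag{14a}$$

$$\psi(2|\ell) = \tau_A\,[1 - F^H(p^A(\ell))] + (1-\tau_A)\,F^H(p^B(\ell)) \tag{14b}$$

and

$$\varphi(1,\ell) = \ell\,\frac{\tau_A F^L(p^A(\ell)) + (1-\tau_A)\,[1 - F^L(p^B(\ell))]}{\tau_A F^H(p^A(\ell)) + (1-\tau_A)\,[1 - F^H(p^B(\ell))]} \tag{15a}$$

$$\varphi(2,\ell) = \ell\,\frac{\tau_A\,[1 - F^L(p^A(\ell))] + (1-\tau_A)\,F^L(p^B(\ell))}{\tau_A\,[1 - F^H(p^A(\ell))] + (1-\tau_A)\,F^H(p^B(\ell))} \tag{15b}$$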

For now, let us sidestep belief cascades as the source of incomplete learning, and simply assume that private beliefs are unbounded. Together with (14a) and (14b), this implies that $\psi(1|\ell)$ and $\psi(2|\ell)$ are bounded away from 0 for all $\ell$. It is also elementary to verify that given (15a) and (15b), the two stationarity conditions $\varphi(1,\ell) = \ell$ and $\varphi(2,\ell) = \ell$ reduce to one and the same requirement: that one action (and thus the other) is taken with equal chance in each of the two states, or

$$\tau_A F^L(p^A(\ell)) + (1-\tau_A)\,[1 - F^L(p^B(\ell))] = \tau_A F^H(p^A(\ell)) + (1-\tau_A)\,[1 - F^H(p^B(\ell))] \tag{16}$$

Intuitively, this asserts that no action reveals anything about the true state of the world, and therefore individuals simply ignore history: the private belief thresholds of the two types are precisely balanced so as to prevent successive individuals from inferring anything from history. We shall say that a solution $\hat\ell$ to (16) is a confounded learning outcome if the interiority condition $F^H(p^A(\hat\ell)) \in (0,1)$ additionally holds.22 This simply excludes the degenerate non-interior solutions to (16) in which the beliefs are so strong and perverse that both types are wholly convinced of their beliefs. Since a confounded learning outcome will almost surely not arise in finite time, we shall say that confounded learning obtains if $\ell_n \to \hat\ell$, where $\hat\ell$ is a confounded learning outcome.

Observe  the  following  nice  distinction  between  a  belief  cascade  and  confounded  learning. 
In  a  cascade,  individuals  disregard  their  own  private  information  and  are  wholly  guided 
by  the  public  belief.  Conversely,  with  confounded  learning,  individuals  (in  the  limit) 
disregard  the  public  information  and  rely  solely  on  their  private  signal.  But  while  cascades 
and  confounded  learning  really  are  different  phenomena,  both  are  pathological  outcomes: 
Social  learning  stops  short  of  an  arbitrarily  focused  belief  on  the  true  state  of  the  world. 
In  the  first  case,  the  termination  is  abrupt,  while  in  the  second,  learning  slowly  dies  out. 

Have we catalogued all possible learning pathologies? One might also imagine an altogether different conclusion of the learning dynamics. Indeed, since we have a difference equation with an interior stable point, there might perchance also exist a stable cycle, i.e. a finite set of at least two non-stable points such that the process once in the cycle would stay within the cycle. But we know that such a stochastic steady state cannot possibly occur because the likelihood ratio is known to converge: That $(\ell_n)$ is a martingale in addition to a Markov process is truly a useful property!
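The martingale property invoked here can be checked directly in the two-type model: conditional on state H, the expected continuation likelihood ratio equals the current one. The sketch below is a numerical illustration only. It assumes the distributions $F^H(p) = p^2$, $F^L(p) = 2p - p^2$ and the thresholds $\ell/(u+\ell)$, $v\ell/(1+v\ell)$ of the example later in this section, with hypothetical parameter values; the labelling of the two actions is immaterial for this check.

```python
def F_H(p):
    return p * p              # assumed c.d.f. of private beliefs in state H

def F_L(p):
    return 2 * p - p * p      # assumed c.d.f. of private beliefs in state L

def thresholds(ell, u, v):
    # Assumed belief thresholds of types A and B (taken from the later example).
    return ell / (u + ell), v * ell / (1 + v * ell)

def action_probs(ell, tau, u, v, F):
    # Chances of actions 1 and 2 under c.d.f. F, in the form of (14a)-(14b).
    pA, pB = thresholds(ell, u, v)
    prob1 = tau * F(pA) + (1 - tau) * (1 - F(pB))
    return prob1, 1 - prob1

def phi(ell, tau, u, v):
    # Continuation likelihood ratios after actions 1 and 2, as in (15a)-(15b).
    h1, h2 = action_probs(ell, tau, u, v, F_H)
    l1, l2 = action_probs(ell, tau, u, v, F_L)
    return ell * l1 / h1, ell * l2 / h2

def expected_next(ell, tau, u, v):
    # Expectation of the next likelihood ratio, taken in state H.
    h1, h2 = action_probs(ell, tau, u, v, F_H)
    f1, f2 = phi(ell, tau, u, v)
    return h1 * f1 + h2 * f2

# Hypothetical parameters: tau = 0.6, u = 0.5, v = 1.0.
drift = [abs(expected_next(ell, 0.6, 0.5, 1.0) - ell) for ell in (0.1, 0.5, 1.0, 3.0, 10.0)]
print(max(drift))
```

The drift vanishes up to rounding error at every test point, which is exactly why a persistent cycle among non-stationary points is ruled out.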

The  Experimentation  Literature  Revisited 

We now return to our earlier analogy to the experimentation literature. An interesting sequel to Rothschild (1974) was McLennan (1984), who permitted the monopolist the flexibility to charge one of a continuum of prices; he assumed for definiteness that the demand curve was one of two linear possibilities, either q = a + bp or q = A + Bp.23 To avoid trivialities, he assumed that neither curve dominated the other, i.e. they crossed at some interior and feasible pair (p, q). He showed that under certain conditions, the optimal price may well converge to p, at which point, no further learning occurs. Intuitively, this corresponds to confounded learning in the observational learning model. The likelihood ratio is tending to an isolated stationary point outside the action absorbing basins. Furthermore, it could only arise because the action space was continuous, and thus the level of experimentation (in the sense of charging an informative price) could slowly peter out.


22 It is easily verified that when $\hat\ell$ is a confounded learning outcome, if $F^H(p^A(\hat\ell)) \in (0,1)$ then $F^H(p^B(\hat\ell)) \in (0,1)$ likewise.
23 Here, p is the price and q is the probability that this period's consumer buys; $b, B < 0$.

We shall comment on the possibility of using some of McLennan's insights in the discussion and example later in this section. But by our earlier remarks, we may adduce one implication already for single person learning theory. Since the likelihood ratio clearly must constitute a conditional martingale in that paradigm too, McLennan's paper captured all possible pathological outcomes of a pure experimentation model.

6.2   Towards  a  Theory 

We  are  now  confronted  with  some  key  questions: 

1. Must confounded learning outcomes exist in our model? Are they unique?

2. Even if a confounded learning outcome $\hat\ell$ exists, does confounded learning actually obtain (i.e. $\ell_n \to \hat\ell$)?

3. With unbounded beliefs, is there still a positive probability of complete learning, i.e. $\ell_n \to 0$ in state H and $\ell_n \to \infty$ in state L? If so, do correct herds arise?

For now, we can only offer partial answers to these questions. For the search for answers shall carry us into relatively uncharted territory on the local and global stability of stochastic difference equations.

Theorem  8  (Confounded  Learning)     Assume  there  is  more  than  one  preference  type. 

(1)  Suppose  that  there  are  at  least  as  many  states  of  the  world  as  actions,  and  that  FH  and 
FL  are  not  entirely  discrete  distributions.  Then  confounded  learning  outcomes  generically 
may  exist,  and  when  they  do,  confounded  learning  obtains  with  positive  probability,  and 
complete  learning  with  chance  less  than  1.  Yet  with  unbounded  beliefs  incorrect  herds 
almost  surely  do  not  arise. 

(2) If there are more actions than states, or if $F^H$ and $F^L$ are discrete distributions, then
generically  no  confounded  learning  outcome  exists,  and  learning  is  almost  surely  complete. 

Proof:     We  focus  first  on  the  simple  case  of  two  states,  two  actions,  and  two  types,  for 
which  there  are  no  more  actions  than  states.  We  shall  later  argue  that  everything  we  say 
holds  with  more  states,  actions,  and  types,  and  in  particular  we  shall  prove  our  claims 
about  the  number  of  states  and  actions. 
Observe that, writing $\tau$ for $\tau_A$, (16) is equivalent to

$$\frac{F^L(p^A(\ell)) - F^H(p^A(\ell))}{F^L(p^B(\ell)) - F^H(p^B(\ell))} = \frac{1-\tau}{\tau} \tag{17}$$

Here we can see why a confounded learning outcome generically exists precisely when the distribution functions have continuous segments. Simply fix any payoff assignment (and by implication the functions $p^A$ and $p^B$), and fix $\ell$. Then calibrate $\tau$ so that $\ell$ solves (17). Of course, this turns the process on its head; $\tau$ should be held fixed while $\ell$ is varied. But if $F^H$ and $F^L$ vary continuously with p at the given $\ell$, and if the left side of (17) is not locally constant in $\ell$ (which will be shown later), then around the solution we have just computed, there will exist a $\tau$-neighborhood in which a confounded learning outcome exists. For the partial converse, if $F^H$ and $F^L$ are entirely discrete, then the left side of (17) will only assume a finite number of values, and confounded learning outcomes will not be generic.
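For completeness, the step from (16) to (17), and the calibration of $\tau$ used in the argument above, can be spelled out as follows. This is a sketch of the algebra only; $\eta(\ell)$ is shorthand for the left side of (17), all thresholds are evaluated at $\ell$, and the division assumes $F^L(p^B) \neq F^H(p^B)$.

```latex
\begin{align*}
\tau\,\bigl[F^L(p^A) - F^H(p^A)\bigr]
  &= (1-\tau)\,\bigl\{[1 - F^H(p^B)] - [1 - F^L(p^B)]\bigr\}
     && \text{(collect terms in (16))} \\
  &= (1-\tau)\,\bigl[F^L(p^B) - F^H(p^B)\bigr] \\
\Longrightarrow\quad
\eta(\ell) \equiv \frac{F^L(p^A) - F^H(p^A)}{F^L(p^B) - F^H(p^B)}
  &= \frac{1-\tau}{\tau}
     && \text{(this is (17))} \\
\Longrightarrow\quad
\tau &= \frac{1}{1 + \eta(\ell)}
     && \text{(calibrated } \tau \text{ for a given } \ell\text{)}
\end{align*}
```

Since $\eta(\ell) > 0$ at any interior $\ell$, the calibrated $\tau$ always lies strictly between 0 and 1.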

We  shall  not  prove  generally  that  confounded  learning  obtains  with  positive  probability, 
but  rather  shall  rely  upon  a  later  proof-by-example. 

With unbounded private beliefs incorrect cascades cannot occur for the same reason as outlined in Lemma 5. Similarly, incorrect herds cannot occur. The only candidates for stationary points are $\ell = 0$ (that is, when the state is H), where all individuals take the correct action, and any confounded learning outcome, where there is no herd.

Finally, with $M > 2$ actions, and any number of preference types, the confounded learning outcome will solve $M - 1$ independent equations in one variable $\ell$, so that nonexistence will be generic. More generally, with multiple states of the world, the number of likelihood ratios will equal the number of states minus one. Hence, Theorems 3 and 5 can fail if there are at least as many states as actions (i.e. confounded learning may arise with positive probability), while they hold if there are more actions than states. ◇

Let  us  now  discuss  what  is  not  addressed  by  the  above  theorem. 

Consider the uniqueness of the confounded learning outcome. Note that with discrete distributions confounded learning outcomes $\hat\ell$ are not unique when they exist, for (as seen in the example below) there will in fact be an interval around $\hat\ell$ of confounded learning outcomes simply because $F^H$ and $F^L$ are locally constant. Whether the confounded learning outcomes are unique modulo this exemption remains to be seen.

We now touch on the issue of whether learning is complete with at least as many states as actions. As complete learning and confounded learning are mutually exclusive events, we cannot yet prove that there is even a positive probability of the former. If the confounded learning outcome is very close to the initial belief $\ell_0 = 1$, it is not at all implausible that it would attract all the mass. Also, with more than two states of the world, the dynamics are multidimensional, and these concerns become even more difficult to address.

Finally, we have until now focused on the learning dynamics rather than on the action choices. With bounded beliefs, we actually cannot be too ambitious in our assertions. For instance, in the trivial case above where both types eventually choose the same action, any limiting likelihood ratio $\ell$ satisfies $F^H(p^A(\ell)) = 1 - F^H(p^B(\ell))$. The overturning argument will hold and so a herd must arise in finite time. But if the two types take different actions in the limit, whether a herd arises is uncertain.

The  Example  Revisited:  Local  Stability 

We  now  show  that  a  confounded  learning  outcome  will  "attract  mass"  locally  (that  is, 
nearby  likelihood  ratios  tend  there  with  positive  probability),  in  our  simple  model  with 
two  states,  two  actions,  and  two  types. 

If density functions $f^H$ and $f^L$ exist, then $\varphi(1,\ell)$ and $\varphi(2,\ell)$ are differentiable in $\ell$. It is then fairly straightforward to see that if $\hat\ell$ satisfies the stationarity criterion (17), we get

$$\psi(1|\hat\ell)\,\varphi_\ell(1,\hat\ell) + \psi(2|\hat\ell)\,\varphi_\ell(2,\hat\ell) = 1.$$

Or, in words, near the confounded learning outcome the expected next $\ell$ is approximately the current $\ell$. This turns out (see Appendix C) to be the crucial ingredient for the local stability of the confounded learning outcome.

In the specific example that we consider, $\mu^H(\sigma) = (1-\sqrt{\sigma})/\sqrt{\sigma}$ while $\mu^L(\sigma) = 1$ for $\sigma \in (0,1)$. It is possible to then deduce quite simply that $F^H(p) = p^2$ and $F^L(p) = 2p - p^2$. Suppose that action $a_2$ is the default option of no investment, yielding a payoff equal to 0 with certainty, while action $a_1$ is an investment that is profitable in state H and unprofitable in state L, with the following payoffs:


| Payoff of $a_1$ | State H | State L |
|---|---|---|
| Type A | $u$ | $-1$ |
| Type B | $-1$ | $v$ |

where $u, v > 0$.
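The two distribution functions quoted for this example can be recovered in a few lines. The derivation below is a sketch that assumes a flat prior, so that the private belief is $p(\sigma) = g(\sigma)/(1+g(\sigma))$ with $g = d\mu^H/d\mu^L$, and that reads $\mu^H(\sigma)$ and $\mu^L(\sigma)$ as densities on $(0,1)$.

```latex
\begin{align*}
g(\sigma) &= \frac{1-\sqrt{\sigma}}{\sqrt{\sigma}}, \qquad
p(\sigma) = \frac{g(\sigma)}{1+g(\sigma)} = 1 - \sqrt{\sigma}, \qquad
\{\sigma : p(\sigma) \le p\} = \{\sigma : \sigma \ge (1-p)^2\}, \\
F^L(p) &= \int_{(1-p)^2}^{1} d\sigma = 1 - (1-p)^2 = 2p - p^2, \\
F^H(p) &= \int_{(1-p)^2}^{1} \bigl(\sigma^{-1/2} - 1\bigr)\, d\sigma
        = \Bigl[\,2\sqrt{\sigma} - \sigma\,\Bigr]_{(1-p)^2}^{1}
        = 1 - 2(1-p) + (1-p)^2 = p^2 .
\end{align*}
```

Both functions run from 0 at $p = 0$ to 1 at $p = 1$ without atoms, so private beliefs are unbounded in this example.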

It is now easy to calculate the belief thresholds $p^A(\ell) = \ell/(u+\ell)$ and $p^B(\ell) = v\ell/(1+v\ell)$. We wish to show that there exists a unique confounded learning outcome $\hat\ell$ that is locally stable in a certain weak stochastic sense. We shall apply Proposition C.1 of Appendix C to prove the stability claim. Observe that

$$\frac{F^L(p^A(\ell)) - F^H(p^A(\ell))}{F^L(p^B(\ell)) - F^H(p^B(\ell))} = \frac{u\,(1+v\ell)^2}{v\,(u+\ell)^2} \equiv \eta(\ell)$$

The behavior of the function $\eta$ is now critical. Let's ignore the degenerate case $uv = 1$ in which $\eta \equiv 1$; for then a confounded learning outcome only exists if $\tau = 1/2$, which is the trivial case in which no inference from history is ever possible. It is fairly easy to see that $\eta$ is strictly monotone if $uv \neq 1$. Assume for definiteness $uv < 1$, so that at most one confounded learning outcome exists. In fact, as the range of $\eta$ is $(uv, 1/uv)$, a confounded learning outcome $\hat\ell$ definitely exists for all $\tau$ close enough to 1/2. It is now a simple algebraic exercise to confirm that $\varphi_\ell(1,\hat\ell) \neq \varphi_\ell(2,\hat\ell)$ with both derivatives positive, and so Proposition C.1 now applies: There is a neighborhood around $\hat\ell$ such that if $\ell_n$ is in that neighborhood, then there is a positive chance that $\ell_n \to \hat\ell$.
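The example can be explored numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the example's $F^H$, $F^L$, and thresholds, picks hypothetical payoffs `u = 0.5`, `v = 1.0`, calibrates $\tau$ so that $\ell = 1$ is the confounded learning outcome, recovers that outcome by bisection, and checks the derivative identity by finite differences. The helper names (`eta`, `solve_confounded`, `prob_invest`, `phi`, `dphi`) are invented for the illustration.

```python
def F_H(p):
    return p * p

def F_L(p):
    return 2 * p - p * p

def eta(ell, u, v):
    # Closed form of the ratio of c.d.f. gaps for this example.
    return u * (1 + v * ell) ** 2 / (v * (u + ell) ** 2)

def solve_confounded(tau, u, v, lo=1e-9, hi=1e9, iters=200):
    # Bisection on a log scale for eta(ell) = (1 - tau) / tau.
    # Relies on eta being strictly monotone, which holds when u * v != 1.
    target = (1 - tau) / tau
    f_lo = eta(lo, u, v) - target
    if f_lo * (eta(hi, u, v) - target) > 0:
        return None  # no sign change: no solution inside [lo, hi]
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        f_mid = eta(mid, u, v) - target
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo * hi) ** 0.5

def prob_invest(ell, tau, u, v, F):
    # Type A invests when the private belief exceeds ell/(u+ell);
    # type B invests when it falls below v*ell/(1+v*ell).
    pA = ell / (u + ell)
    pB = v * ell / (1 + v * ell)
    return tau * (1 - F(pA)) + (1 - tau) * F(pB)

def phi(m, ell, tau, u, v):
    # Continuation likelihood ratio after investing (m = 1) or not (m = 2).
    h = prob_invest(ell, tau, u, v, F_H)
    l = prob_invest(ell, tau, u, v, F_L)
    if m == 2:
        h, l = 1 - h, 1 - l
    return ell * l / h

u, v = 0.5, 1.0                      # hypothetical payoffs with u * v < 1
tau = 1 / (1 + eta(1.0, u, v))       # calibrate tau so that ell = 1 solves the equation
ell_hat = solve_confounded(tau, u, v)

def dphi(m, ell, step=1e-6):
    # Central finite-difference approximation of the derivative in ell.
    return (phi(m, ell + step, tau, u, v) - phi(m, ell - step, tau, u, v)) / (2 * step)

psi1 = prob_invest(ell_hat, tau, u, v, F_H)
identity = psi1 * dphi(1, ell_hat) + (1 - psi1) * dphi(2, ell_hat)
print(tau, ell_hat, dphi(1, ell_hat), dphi(2, ell_hat), identity)
```

At the recovered outcome both continuation ratios equal $\hat\ell$, the two derivatives are positive and differ from each other, and their probability-weighted average equals 1, in line with the local stability discussion above.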

Global  Stability? 

To  be  sure,  we  would  prefer  a  "global  stability"  result.24  Because  a  confounded  learning 
outcome  locally  attracts  mass,  there  are  non-trivial  specifications  under  which  it  attracts 
mass  globally.  Even  if  a  stationary  point  is  locally  stable  we  must  ensure  that  the  dynamics 
can  actually  enter  the  stable  region  from  outside:  This  is  by  no  means  trivial,  as  the 
stochastic  process  might  oscillate  across  the  stable  region  without  ever  landing  there.  But 
global  properties  of  dynamical  systems  are  in  general  notoriously  hard  to  deduce,  so  that 
one  ought  to  expect  little  progress  on  this  latter  front.  Progress  here  is  intertwined  with  a 
comprehension  of  the  complicated  asymptotic  properties  of  stochastic  difference  equations. 


24 To the best of our knowledge, and much to our surprise, stability theory for stochastic difference equations really does not exist, and so we are coining terms here. We call a fixed point $y$ of a stochastic difference equation locally stable if $\Pr(\lim_{n\to\infty} y_n = y) > 0$ whenever $y_0 \in N_\epsilon$, a small enough neighborhood about $y$. If $\Pr(\lim_{n\to\infty} y_n = y) > 0$ for all $y_0$, then $y$ is globally stable. Finally, if for every neighborhood $N_\epsilon$ of $y$, there is a smaller neighborhood $N_\epsilon^*$ of $y$ in $N_\epsilon$ such that $\Pr(y_n \in N_\epsilon\ \forall n \mid y_0 \in N_\epsilon^*) > 0$, then we simply call $y$ stable. One might also wish to preface each of these labels by 'almost sure' if the probabilities are 1, but we shall not have occasion to make such assertions.

Yet McLennan managed to establish global stability for certain parameter specifications in his model. In a nutshell, his basic idea was to argue that whenever $\ell_n$ is on one side of the confounded learning outcome, then $\ell_{n+1}$ must be on the same side. We unfortunately can find no reason to expect that such a wonderful monotonicity property obtains here. For the example, it would suffice to prove (which we cannot) that globally $\varphi_\ell(m,\cdot) > 0$ for $m = 1,2$. For in light of $\varphi(m,\hat\ell) = \hat\ell$, that would imply $\varphi(m,\ell) \gtrless \hat\ell$ for $\ell \gtrless \hat\ell$.

Signals  about  Types? 

We  have  maintained  in  this  section  the  working  hypothesis  that  types  were  unobserv- 
able  —  for  with  perfectly  observed  types  the  analysis  of  section  4  applies.  But  consider 
the  middle  ground.  Suppose,  for  simplicity,  that  after  individual  n  moves,  subsequent 
individuals  receive  an  informative  binary  public  signal  as  to  his  type.  Then  the  dynamics 
are  now  modified,  as  there  will  be  four  different  possible  continuations,  namely  one  for 
each  of  the  possible  combinations  of  individual  n's  action  and  of  the  type  signal  value.  A 
confounded learning outcome will then have to solve three independent equations rather than one, and so generically confounded learning will not arise. We anticipate that the
results  of  section  4  will  obtain  here,  with  appropriate  modifications. 

7.   COSTLY  INFORMATION 

We  now  reconsider  the  theory  under  the  reasonable  assumption  that  information  is 
costly  to  acquire.  As  we  shall  see,  it  is  not  only  important  whether  this  means  the  public 
or  the  private  information,  but  also  what  the  exact  timing  of  the  signal  acquisition  is.  One 
might  imagine  variations  on  this  theme  allowing  for  endogenous  information  quality,  but 
we  shall  avoid  such  side  issues.  Overall,  costly  signals  will  make  incomplete  learning,  and 
thus  incorrect  herds,  more  likely. 

1.  Costly  Public  Information 

First,  assume  that  no  information  is  revealed  before  the  signals  are  acquired,  and 
assume  that  the  private  signal  is  free,  while  the  public  information  (the  observation  of  all 
predecessors' actions) costs c. Then early individuals may not buy the public information and thus will wholly follow their own signal, so the action history will eventually become very informative. Sooner or later an individual will find it worthwhile to buy the information.
As  the  public  information  only  (weakly)  improves  over  time,  all  subsequent  individuals 
will  decide  likewise.  We  have  now  arrived  at  the  old  model,  and  so  the  existing  theory 
obtains. 

Next  suppose  that  the  individuals  can  observe  their  private  signal  before  they  decide 
whether  to  buy  the  public  information25  —  which  they  will  do  whenever  its  option  value 
justifies  its  cost.  But  by  the  previous  logic,  the  public  information  can  only  become  more 
focused  over  time,  and  therefore  it  gets  still  more  attractive,  and  therefore  more  and  more 


25  While  in  the  previous  formulation,  it  did  not  matter  whether  the  decision  to  acquire  information  was 
observable,  it  does  now  (since  this  decision  reveals  something  about  the  individual's  private  signal  too). 
But this variation is inessential for the qualitative description that follows, and so we do not further nuance
here. 

individuals will decide to buy it along the way. For each n, everyone with private beliefs in $[\theta_n, 1-\theta_n]$ will buy the signal, and others will not, where the threshold $\theta_n \to 1$. So, with bounded private beliefs all individuals will eventually buy the public information, and a herd almost surely arises. With unbounded private beliefs, the public belief in state H will converge to 1, and so full learning obtains.

2.  Costly  Private  Information 

Now suppose that the public information is free, but the private signal comes at a cost c. Assume first that the purchase decision occurs before the public history is observed. Of course, private signals must be sufficiently worthwhile that the first individual is willing to buy the private signal at the cost c. By similar arguments, everyone after some individual N will not purchase private signals, and therefore the public information will not improve. A herd will begin, even with unbounded beliefs! Interestingly enough, N is known ex ante: it is not stochastic. That fact stands in contrast to all other herding results we have seen so far.

If  individuals  may  first  observe  history  before  buying  their  private  signal,  then  such 
purchases  will  continue  until  the  public  information  reaches  a  certain  threshold.  The 
above  result  obtains,  only  this  time  the  herd  starts  at  a  stochastic  individual. 

3.  Costly  Private  and  Public  Information 

Finally, consider the combination where both public and private information are costly.
It  is  not  hard  to  extrapolate  from  the  above  analysis:  Individuals  will  initially  only  buy 
the  private  signal,  but  not  the  public  one.  After  a  while,  they  will  start  to  find  the  public 
information  attractive,  and  finally  the  public  information  will  become  so  good  that  they 
will  no  longer  buy  the  private  signal.  Of  course,  timing  is  an  issue,  and  so  the  decision  to 
buy  the  public  information  may  or  may  not  be  predicated  on  the  content  of  the  private 
signal.  This  story  of  how  the  information  is  generated  by  the  first  agents,  while  later 
agents  are  free-riding  on  the  public  knowledge,  captures  an  essential  characteristic  of  the 
herding  phenomenon. 

A.  CONSEQUENCES  OF  BAYES  UPDATING 

We  originally  derived  all  results  in  this  Appendix  by  proofs  other  than  the  presented 
ones.26  We  consider  the  proofs  offered  here  easier  and  more  direct.  Inspiration  for  the 
current  formulation  was  found  in  the  exercises  to  Chapter  8  in  DeGroot  (1970). 

The setup is taken from section 2.1, except that now we assume that the prior chance that the state is H is $\xi \in (0,1)$. Hence, the private belief $p \in (0,1)$ satisfies

$$p(\sigma) = \frac{\xi\, g(\sigma)}{\xi\, g(\sigma) + (1-\xi)}$$

Lemma A.1 $F^H$ and $F^L$ have the same support.


26 One may derive our lemmas through the Law of Iterated Expectations, which for us says that applying
Bayes'  rule  more  than  once  is  redundant.  That  gives  rise  to  a  quite  strong  characterization  of  the  Radon- 
Nikodym  derivative  of  FH  w.r.t.  FL,  but  we  omit  the  details. 

Proof: Note that $p(\sigma)$ is in the support of $F^H$ (resp. $F^L$) exactly when $\sigma$ is in the support of $\mu^H$ (resp. $\mu^L$). But $\mu^H$ and $\mu^L$ are mutually a.c. and thus have the same support. ◇

Lemma A.2 The function $F^L(p) - F^H(p)$ is weakly increasing for $p < \xi$ and weakly decreasing for $p > \xi$. Moreover, $F^L(p) > F^H(p)$ except when $F^L(p) = F^H(p) = 0$ or $F^L(p) = F^H(p) = 1$.

Proof: Observe that for all $\sigma$ in the support of $\mu^H$ and $\mu^L$ we have

$$p(\sigma) < \xi \iff g(\sigma) < \xi\, g(\sigma) + (1-\xi) \iff g(\sigma) < 1$$

Now, when the Radon-Nikodym derivative $g = d\mu^H/d\mu^L < 1$ on a set of signals, then that set is accorded smaller probability mass under $F^H$ than $F^L$. So $F^L$ grows strictly faster than $F^H$ on $(0,\xi)$ because its derivative is larger when it exists, and because it has larger atoms; similarly, $F^L$ grows strictly slower than $F^H$ on $(\xi,1)$. Finally, in order that the above strict assertions obtain, $F^L > F^H$ necessarily in the interior of supp(F). ◇

Lemma A.3 Assume $\xi = 1/2$. For any $p \in (0,1)$, we have the inequality

$$F^H(p) \le \frac{p}{1-p}\,F^L(p)$$

Proof: First observe that for any $p \in (0,1)$ we have

$$p(\sigma) \le p \iff \frac{g(\sigma)}{g(\sigma)+1} \le p \iff g(\sigma) \le \frac{p}{1-p}$$

Simple integration then yields the desired

$$F^H(p) = \int_{p(\sigma)\le p} g(\sigma)\,d\mu^L(\sigma) \le \frac{p}{1-p}\,F^L(p)$$

◇
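Lemmas A.2 and A.3 can be sanity-checked on a concrete pair of distributions. The sketch below is only an illustration: it assumes the pair $F^H(p) = p^2$, $F^L(p) = 2p - p^2$ from the example of Section 6, which corresponds to a flat prior $\xi = 1/2$, and tests the claims on a grid.

```python
def F_H(p):
    return p * p              # assumed c.d.f. in state H

def F_L(p):
    return 2 * p - p * p      # assumed c.d.f. in state L

grid = [k / 1000 for k in range(1, 1000)]   # interior points of (0, 1)
tol = 1e-12

# Lemma A.3: F^H(p) <= p / (1 - p) * F^L(p).
a3_holds = all(F_H(p) <= p / (1 - p) * F_L(p) + tol for p in grid)

# Lemma A.2: the gap F^L - F^H rises up to the prior 1/2, falls after it,
# and is strictly positive in the interior of the support.
gap = [F_L(p) - F_H(p) for p in grid]
rising = all(gap[i] <= gap[i + 1] + tol
             for i in range(len(grid) - 1) if grid[i + 1] <= 0.5)
falling = all(gap[i] >= gap[i + 1] - tol
              for i in range(len(grid) - 1) if grid[i] >= 0.5)
positive = all(g > 0 for g in gap)

print(a3_holds, rising, falling, positive)
```

For this pair the gap is $2p(1-p)$, which makes the single peak at the prior easy to see by hand as well.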


B.  OVERTURNING 

This  appendix  is  devoted  to  the  proof  of  Lemma  7. 

Lemma B.1 (The Overturning Principle) For any history h, if an individual optimally takes action $a_m$, then the updated likelihood ratio must satisfy

$$\ell(h,a_m) \in \left[\frac{1-\bar r_m}{\bar r_m},\ \frac{1-\bar r_{m-1}}{\bar r_{m-1}}\right) \tag{18}$$

Proof: Let the history h be given. Individual n uses the likelihood ratio $\ell(h)$ and his private signal $\sigma_n$ to form his posterior odds $\ell(h)/g(\sigma_n)$ for state L. Since he optimally chose action $a_m$, we know from the definition of the $\bar r_m$'s and the tie-breaking rule, that

$$\bar r_{m-1} < \frac{1}{1+\ell(h)/g(\sigma_n)} \le \bar r_m$$

or, rewritten,

$$\frac{\bar r_{m-1}}{1-\bar r_{m-1}}\,\ell(h) < g(\sigma_n) \le \frac{\bar r_m}{1-\bar r_m}\,\ell(h) \tag{19}$$

Denote by $\Sigma(h)$ the set of signals $\sigma_n$ that satisfy (19) for a given h.

To form $\ell(h,a_m)$, individual n+1 must calculate the probability that individual n would take action $a_m$, conditional on h. Individual n+1 knows that his predecessor would take action $a_m$ exactly when his $\sigma_n \in \Sigma(h)$. So he can calculate

$$\alpha^H(h) = \int_{\Sigma(h)} g\,d\mu^L \tag{20a}$$

$$\alpha^L(h) = \int_{\Sigma(h)} d\mu^L \tag{20b}$$

and form the likelihood ratio

$$\ell(h,a_m) = \ell(h)\,\frac{\alpha^L(h)}{\alpha^H(h)} \tag{21}$$
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But  by  definition  of  E(/i),  we  also  know  that2 

/      gdliL>-^-i{h)f      V 

ys(h)  l  -  rm_!        ys(h) 

/     gdnL<-^Uh)  [     dnL 
Ji.{h)  1  -  rm  7e(/.) 

Finally.  (20a).  (20b).  (21),  and  the  above  imply  that 


1  -  rm_i       »,,        .       1  —  r„ 
>  £{n,am)  >  — — 


'm-l 

just  as  claimed.  <0> 
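
As an illustrative aside, the bounds of the Overturning Principle can be checked numerically in a finite-signal example. The sketch below is not taken from the paper: the signal distributions `mu_L` and `mu_H`, the belief thresholds `r_bar`, and all function names are hypothetical choices made only for illustration. It assumes that action $a_m$ is taken when the posterior belief in state $H$ lies in the interval from the threshold below (exclusive) to the threshold above (inclusive), and it treats the bound at a zero threshold as infinite.

```python
import math

# Hypothetical finite-signal example (none of these numbers come from the paper).
mu_L = [0.4, 0.3, 0.2, 0.1]   # signal probabilities in state L
mu_H = [0.1, 0.2, 0.3, 0.4]   # signal probabilities in state H, so g = mu_H / mu_L
r_bar = [0.0, 0.3, 0.7, 1.0]  # assumed belief thresholds r_0 < r_1 < r_2 < r_3


def odds_for_L(r):
    """Map a belief threshold r into the likelihood-ratio bound (1 - r) / r."""
    return math.inf if r == 0 else (1 - r) / r


def updated_ratio(ell, m):
    """Return the updated ratio after action a_m, or None if a_m cannot occur."""
    # Signals whose posterior belief in H lies in (r_{m-1}, r_m].
    sigma_set = [
        k for k in range(len(mu_L))
        if r_bar[m - 1] < 1 / (1 + ell * mu_L[k] / mu_H[k]) <= r_bar[m]
    ]
    if not sigma_set:
        return None
    alpha_H = sum(mu_H[k] for k in sigma_set)  # discrete analogue of (20a)
    alpha_L = sum(mu_L[k] for k in sigma_set)  # discrete analogue of (20b)
    return ell * alpha_L / alpha_H             # discrete analogue of (21)


def check_overturning(ell):
    """Check the overturning bounds for every action that can occur at ratio ell."""
    for m in range(1, len(r_bar)):
        new_ell = updated_ratio(ell, m)
        if new_ell is not None:
            lower, upper = odds_for_L(r_bar[m]), odds_for_L(r_bar[m - 1])
            if not (lower <= new_ell < upper):
                return False
    return True


results = {ell: check_overturning(ell) for ell in (0.2, 1.0, 5.0)}
print(results)
```

In this example every action that can occur moves the public likelihood ratio into the interval associated with that action, whatever the prior ratio was, which is the sense in which a single action "overturns" the history.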

C. STABILITY OF A STOCHASTIC DIFFERENCE EQUATION

In this appendix, we first develop a global stability criterion for linear difference equations.
We then use that result to show that the linear dynamics remain in any given neighborhood of the
origin with positive probability. Finally, we derive a result on local stability of a nonlinear
dynamical system.

Consider linear stochastic difference equations of the following form. Let an i.i.d. stochastic
process $(y_n)$ be given, such that $\Pr(y_n = 1) = p = 1 - \Pr(y_n = 0)$. Define the stochastic
process $(\ell_n)$ on $\mathbb{R}$ as follows: $\ell_0$ is given, and

$$\ell_n = \begin{cases} a\,\ell_{n-1} & \text{if } y_n = 1 \\ b\,\ell_{n-1} & \text{if } y_n = 0 \end{cases} \qquad (22)$$

where $a$ and $b$ are fixed real constants. If $Y_n = \sum_{k=1}^{n} y_k$, then the solution to the difference
equation is described by

$$\ell_n = a^{Y_n} b^{\,n - Y_n}\, \ell_0 \qquad (23)$$

27 Notice that the strict inequality only survives the integration under the assumption that $\Sigma(h)$ has a
positive measure under $\mu^L$. Otherwise, both sides equal zero.

The  following  result  is  a  rather  straightforward  generalization  of  the  standard  stability 
criterion  for  linear  difference  equations: 

Lemma C.1 (Global Stability)  If $|a|^p |b|^{1-p} < 1$ then $\ell_n \to 0$ almost surely, while if
$|a|^p |b|^{1-p} > 1$ then $|\ell_n| \to \infty$ almost surely.

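Proof: The essence of the proof lies in the fact that by the Strong Law of Large Numbers,

$$p = \lim_{n \to \infty} \frac{Y_n}{n} \quad \text{a.s.}$$

But if $|a|^p |b|^{1-p} < 1$ and $Y_n/n \to p$, then there exist $\varepsilon > 0$ and some $N$ such that for all
$n > N$,

$$|a|^{Y_n/n}\, |b|^{1 - Y_n/n} < 1 - \varepsilon$$

Now, use (23) to see that

$$|\ell_n| = |a^{Y_n} b^{\,n - Y_n} \ell_0| = \left( |a|^{Y_n/n}\, |b|^{1 - Y_n/n} \right)^n |\ell_0|$$

which is at most $(1 - \varepsilon)^n |\ell_0|$ for $n > N$, and this in turn implies that $\ell_n \to 0$ a.s. The rest
of the lemma follows similarly. ◊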

This criterion deserves a few comments. One might imagine that the arithmetic mean,
and not the geometric mean, of the coefficients, namely $pa + (1-p)b$, would determine the
behavior of a linear system. In the standard theory of difference equations $p = 1$, and so
these two averages coincide. If we reformulate the criterion by first taking logarithms, as

$$p \log|a| + (1-p) \log|b| < 0,$$

then this is reminiscent of stability results from the theory of differential equations, and
it is common for the logarithm to enter when translating from difference to differential
equations.
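
The gap between the two averages is easy to see numerically. The following minimal sketch is not part of the paper: the coefficient values `a`, `b`, `p`, the seed, and the function name are illustrative assumptions. It picks coefficients whose arithmetic mean exceeds one while their geometric mean is below one, and uses the closed form (23) in logarithms to estimate the long-run growth rate of $|\ell_n|$.

```python
import math
import random


def log_growth_rate(a, b, p, n, seed=0):
    """Return (1/n) * log|l_n / l_0| along one simulated path of (22)."""
    rng = random.Random(seed)
    # Y_n counts the periods with y_k = 1; by (23),
    # log|l_n| - log|l_0| = Y_n * log|a| + (n - Y_n) * log|b|.
    big_y = sum(1 for _ in range(n) if rng.random() < p)
    return (big_y * math.log(abs(a)) + (n - big_y) * math.log(abs(b))) / n


a, b, p = 2.0, 0.3, 0.5   # illustrative values, not from the paper
arithmetic_mean = p * a + (1 - p) * b               # 1.15, above one
geometric_mean = abs(a) ** p * abs(b) ** (1 - p)    # about 0.775, below one
rate = log_growth_rate(a, b, p, n=200_000)

print(f"arithmetic mean = {arithmetic_mean:.3f}")
print(f"geometric mean  = {geometric_mean:.3f}")
print(f"simulated rate  = {rate:.4f}, log geometric mean = {math.log(geometric_mean):.4f}")
```

Under these assumed values the simulated rate should be negative and close to the logarithm of the geometric mean, so the path shrinks toward zero even though the arithmetic mean of the coefficients is above one.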

It is straightforward to generalize Lemma C.1 to the case of more than two continuations,
i.e. where $y_n$ has arbitrary finite support. The analysis for multidimensional $\ell_n$ is also of
some interest, but unfortunately in that case only one half of Lemma C.1 goes through.
Indeed, let $\ell_n \in \mathbb{R}^n$ and assume

$$\ell_n = \begin{cases} A\,\ell_{n-1} & \text{if } y_n = 1 \\ B\,\ell_{n-1} & \text{if } y_n = 0 \end{cases}$$

where $A$ and $B$ are given real $n \times n$ matrices. Let $\|A\|$ and $\|B\|$ denote the operator
norms of the matrices.28 Then the following half of Lemma C.1 goes through, with nearly
unchanged proof: If $\|A\|^p \|B\|^{1-p} < 1$ then $\ell_n \to 0$ a.s. Since this is the only part of
Lemma C.1 that is used in the sequel, our local stability assertions will also be valid in a
multidimensional space.

Next we shall provide a different stability result, which is helpful for the later
characterization of non-linear systems. We consider system (22) once more.


28 That is, $\|A\| = \sup_{|x| = 1} |Ax|$.

Lemma C.2  If $|a|^p |b|^{1-p} < 1$ and $\mathcal{N}_0$ is any open neighborhood about 0, then there is a
positive probability that $\ell_n \in \mathcal{N}_0$ for all $n$, provided $\ell_0 \in \mathcal{N}_0$.

Proof: First, if $a$ or $b$ is 0, then there is a positive probability of jumping to 0 immediately,
and the system will stay there. So, assume now that $a, b, \ell_0 \ne 0$. We already know from
the previous lemma that $\ell_n \to 0$ almost surely. To be explicit, this means that for all
$N \in \mathbb{N}$, we have

$$\Pr\left( \bigcup_{M \in \mathbb{N}} \bigcap_{n \ge M} \{ \omega \in \Omega : \|\ell_n\| < 1/N \} \right) = 1$$

There must then be some $M \in \mathbb{N}$ such that

$$\Pr\left( \{ \omega \in \Omega : \forall n \ge M,\ \|\ell_n\| < 1/N \} \right) > 0$$

In particular, so long as $\ell_M \in (-1/N, 1/N)$, there is a positive chance of $\ell_n \in (-1/N, 1/N)$
for all $n \ge M$. Also, for given $\ell_0$, supp($\ell_M$) is finite, and $\ell_0 \ne 0$ implies $\ell_M \ne 0$; therefore,
there exists a value $\bar\ell_M \in (-1/N, 1/N) \setminus \{0\}$ such that, with positive probability,
$\ell_n \in (-1/N, 1/N)$ for all $n \ge M$ if $\ell_M = \bar\ell_M$.
The proof is closed by appealing to two simple invariance properties of the problem.
First, time invariance allows us to conclude that, with positive probability, $\ell_n \in (-1/N, 1/N)$
for all $n$ if $\ell_0 = \bar\ell_M$.
Second, the equations are linear, so that if $\ell_0 < \bar\ell_M$ then $\ell_n \in (-(\ell_0/\bar\ell_M)/N, (\ell_0/\bar\ell_M)/N)$
for all $n$. But finally notice that from any point of a large neighborhood, one can reach the
inner neighborhood in only a finite number of steps, which occurs with positive probability
too. This completes the proof. ◊
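
A rough Monte Carlo sketch can illustrate the statement of Lemma C.2. It is not part of the paper: the coefficients, the starting point, the radius of the neighborhood, the finite horizon, and the seed are all assumptions chosen for illustration, and a finite horizon can only approximate the event "for all $n$". The code tracks $\log|\ell_n|$ and records the share of simulated paths that never leave the open interval $(-1, 1)$.

```python
import math
import random


def stays_inside(a, b, p, l0, radius, horizon, rng):
    """True if |l_n| < radius for n = 1, ..., horizon along one path of (22)."""
    x = math.log(abs(l0))          # track log|l_n| to avoid underflow
    bound = math.log(radius)
    for _ in range(horizon):
        x += math.log(abs(a)) if rng.random() < p else math.log(abs(b))
        if x >= bound:             # the path has left the open neighborhood
            return False
    return True


rng = random.Random(1)
paths = 2000
# Illustrative values: a = 2.0, b = 0.3, p = 0.5, l_0 = 0.5, neighborhood (-1, 1).
share = sum(stays_inside(2.0, 0.3, 0.5, 0.5, 1.0, 200, rng) for _ in range(paths)) / paths
print(f"share of paths that stay inside the neighborhood: {share:.3f}")
```

With these values a path that starts by multiplying by $a$ leaves the neighborhood at once, so the share cannot be close to one; the lemma asserts only that the probability of never leaving is strictly positive.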

We next use these results to investigate the local stability of non-linear stochastic
dynamical systems. While we have so far proceeded on a very general level, we now take
advantage of the far more special assumptions relevant for the application in section 6.
This is only slight overkill, since the appendix does not require the martingale property.

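In the notation of section 3, the system is now described by

$$\ell_n = \begin{cases} \varphi(1, \ell_{n-1}) & \text{if } y_n = 1 \\ \varphi(2, \ell_{n-1}) & \text{if } y_n = 0 \end{cases} \qquad (24)$$

where $\varphi(1, \cdot)$ and $\varphi(2, \cdot)$ are given functions, and $\ell_0$ is given. Moreover, we shall now
abandon the i.i.d. assumption on the stochastic process $(y_n)$, and posit instead that

$$\Pr(y_n = 1) = \psi(1 \mid \ell_{n-1}) = 1 - \Pr(y_n = 0)$$

We are concerned with stationary points $\hat\ell$ of (24), namely the solutions to

$$\varphi(1, \hat\ell) = \hat\ell \quad \text{and} \quad \varphi(2, \hat\ell) = \hat\ell \qquad (25)$$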

Proposition C.1 (Local Stability)  Fix a stationary point $\hat\ell$ given by (25). Assume
that at $\hat\ell$, $\varphi(1, \cdot)$ and $\varphi(2, \cdot)$ are both continuously differentiable, while $\psi(1 \mid \cdot)$ is continuous.
Suppose also that $\hat\ell$ satisfies $\psi(1 \mid \hat\ell) \in (0, 1)$, $\varphi_\ell(1, \hat\ell) > 0$, $\varphi_\ell(2, \hat\ell) > 0$,
$\varphi_\ell(1, \hat\ell) \ne \varphi_\ell(2, \hat\ell)$,
and $\psi(1 \mid \hat\ell)\varphi_\ell(1, \hat\ell) + (1 - \psi(1 \mid \hat\ell))\varphi_\ell(2, \hat\ell) = 1$. Then there is a neighborhood around $\hat\ell$, such
that from any point in this neighborhood there is a positive probability that $\ell_n \to \hat\ell$.

Proof: We proceed as follows. First, we linearize (and majorize) the nonlinear dynamical
system around $\hat\ell$ by a linear stochastic difference equation of the form just treated (that
satisfies the conclusions of Lemma C.2). Next we argue that the conclusion of Lemma C.2
must apply to the original non-linear dynamical system.

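Since $\psi(1 \mid \hat\ell)\varphi_\ell(1, \hat\ell) + (1 - \psi(1 \mid \hat\ell))\varphi_\ell(2, \hat\ell) = 1$ and $\varphi_\ell(1, \hat\ell) \ne \varphi_\ell(2, \hat\ell)$, the arithmetic
mean-geometric mean inequality yields

$$\varphi_\ell(1, \hat\ell)^{\psi(1 \mid \hat\ell)}\, \varphi_\ell(2, \hat\ell)^{(1 - \psi(1 \mid \hat\ell))} < 1. \qquad (26)$$

By the continuity assumptions on $\varphi_\ell(1, \cdot)$ and $\varphi_\ell(2, \cdot)$, this inequality obtains in a
neighborhood of $\hat\ell$. As $\varphi_\ell(1, \hat\ell) \ne \varphi_\ell(2, \hat\ell)$ and their average is 1, we may assume WLOG that

$$\varphi_\ell(1, \hat\ell) < 1 < \varphi_\ell(2, \hat\ell)$$

We now claim that there are constants $a, b, p > 0$, and a small enough neighborhood
$\mathcal{N}(\hat\ell)$ around $\hat\ell$, such that for all $\ell \in \mathcal{N}(\hat\ell)$:

$$a^p b^{1-p} < 1$$

$$0 < \varphi_\ell(1, \ell) < a < 1 < \varphi_\ell(2, \ell) < b$$

$$0 < p < \psi(1 \mid \ell)$$

This may not be obvious, so let us spell it out. Over any compact neighborhood of $\hat\ell$ we
can separately maximize $\varphi_\ell(1, \ell)$, $\varphi_\ell(2, \ell)$ and $-\psi(1 \mid \ell)$. If we substitute the three maxima
$a$, $b$, and $-p$ respectively, in (26), then continuity tells us that its left side converges to

$$(\varphi_\ell(1, \hat\ell))^{\psi(1 \mid \hat\ell)}\, (\varphi_\ell(2, \hat\ell))^{(1 - \psi(1 \mid \hat\ell))}$$

as the compact neighborhoods get small enough. Since this limit is strictly below 1 by (26),
$a^p b^{1-p} < 1$ holds once the neighborhood is small enough.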

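Fix $\ell_0 \in \mathcal{N}(\hat\ell)$. There clearly exists a sequence of i.i.d. stochastic variables $(\sigma_n)$, each
uniformly distributed on $[0, 1]$, such that $y_n = 1$ exactly when $\sigma_n < \psi(1 \mid \ell_{n-1})$ and $y_n = 0$
otherwise. Introduce a new stochastic process $(\tilde y_n)$ defined by $\tilde y_n = 1$ when $\sigma_n < p$, and
$\tilde y_n = 0$ otherwise. Use this to define a new stochastic process $(\tilde\ell_n)$ by $\tilde\ell_0 = \ell_0$, and

$$\tilde\ell_n - \hat\ell = \begin{cases} a\,(\tilde\ell_{n-1} - \hat\ell) & \text{if } \tilde y_n = 1 \\ b\,(\tilde\ell_{n-1} - \hat\ell) & \text{if } \tilde y_n = 0 \end{cases}$$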

We now argue that the linear process $(\tilde\ell_n)$ majorizes the non-linear system $(\ell_n)$. Observe
that $(\tilde y_n)$ are independent because $(\sigma_n)$ are. Thus Lemma C.1 is valid, and tells us that
$\tilde\ell_n \to \hat\ell$ a.s., while Lemma C.2 asserts that there is a positive probability that $\tilde\ell_n \in \mathcal{N}(\hat\ell)$
for all $n$. So consider just such a realization of $(\sigma_n)$ whereby $\tilde\ell_n \in \mathcal{N}(\hat\ell)$ for all $n$ and
$\tilde\ell_n \to \hat\ell$. Because $\psi(1 \mid \ell) > p$ when $\ell \in \mathcal{N}(\hat\ell)$, we have $\tilde y_n = 1 \Rightarrow y_n = 1$. Thus
$0 < \varphi_\ell(1, \ell) < a < 1 < \varphi_\ell(2, \ell) < b$ yields the desired majorization for all $n$ (in this
realization), namely

$$\|\tilde\ell_n - \hat\ell\| \ge \|\ell_n - \hat\ell\|$$

For no matter what the outcome of $y_n$ is, $\tilde\ell_n$ is always moved further away from $\hat\ell$ than
is $\ell_n$ (as seen in figures 2 and 3). So, for any such realization of $(\sigma_n)$, $\ell_n \to \hat\ell$. We thus
conclude that $\ell_n \to \hat\ell$ with positive probability. ◊

Figure 2: Dominance Argument I. This depicts how an iteration under $\varphi(1, \cdot)$ brings
the image closer to the stationary point $\hat\ell$ than an iteration under the linearization with
slope $a$, when $\varphi_\ell(1, \cdot) < a$. [Figure labels: diagonal]

Figure 3: Dominance Argument II. This depicts how an iteration under $\varphi(2, \cdot)$ moves
the image point closer to the stationary point $\hat\ell$ than an iteration under the linearization
with slope $b$, when $\varphi_\ell(2, \cdot) < b$. [Figure labels: line, slope b]
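
To make the local stability result of Proposition C.1 concrete, the sketch below simulates a small non-linear system of the form (24). Everything in it is a hypothetical construction for illustration and is not taken from the paper: the stationary point `L_HAT`, the maps `phi`, the probability `psi`, the radius, the horizon, and the seed. The maps are chosen so that their derivatives at the stationary point are 0.5 and 1.5 and the probability of the first continuation there is 0.5, so the weighted average of the derivatives equals 1, as the proposition requires.

```python
import random

L_HAT = 1.0      # hypothetical stationary point
RADIUS = 0.5     # neighborhood of L_HAT within which the path must remain


def phi(i, ell):
    """Hypothetical transition maps with phi(i, L_HAT) = L_HAT."""
    d = ell - L_HAT
    if i == 1:
        return L_HAT + 0.5 * d - 0.1 * d * d   # derivative 0.5 at L_HAT
    return L_HAT + 1.5 * d + 0.2 * d * d       # derivative 1.5 at L_HAT


def psi(ell):
    """Hypothetical Pr(y_n = 1) given the previous value ell; 0.5 at L_HAT."""
    return 0.5 + 0.1 * (ell - L_HAT)


def converges(ell0, horizon, rng, tol=1e-6):
    """True if the path stays within RADIUS of L_HAT and ends within tol of it."""
    ell = ell0
    for _ in range(horizon):
        ell = phi(1, ell) if rng.random() < psi(ell) else phi(2, ell)
        if abs(ell - L_HAT) >= RADIUS:
            return False
    return abs(ell - L_HAT) < tol


rng = random.Random(2)
paths = 1000
share = sum(converges(1.05, 2000, rng) for _ in range(paths)) / paths
print(f"share of simulated paths converging to L_HAT: {share:.3f}")
```

In this constructed example a clear majority of paths started near the stationary point should settle at it, while some escape the neighborhood, which is consistent with convergence holding with positive probability rather than almost surely.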
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